RFCs icon indicating copy to clipboard operation
RFCs copied to clipboard

GC proposal

Open craff opened this issue 1 year ago • 5 comments

craff avatar Nov 29 '24 18:11 craff

@damiendoligez Implemented various approaches to aging in the past, which is what your first suggestion is. They always turned out to perform worse than the current approach because the extra book-keeping dominated the improvement in promotion rate. So eventually we gave up on them.

lpw25 avatar Dec 01 '24 09:12 lpw25

This proposal was harder to read than it could have been because some terminology is used in a non-standard way. What's called the "grayval list" here is usually known as the remembered set. The grey values or mark stack is a different data structure, used in major GC marking and not relevant for the minor GC. There's an excellent glossary at memorymanagement.org which defines all the relevant terms.

For the actual second suggestion, I'm afraid that incrementing and decrementing a ref counter in the minor heap is likely to be more expensive than maintaining a remembered set in the way it is currently done, at least with multiple domains. The remembered set is per-domain so can be accessed directly, while the counters are shared so must use slower atomic accesses and will cause cache contention between domains.

stedolan avatar Dec 02 '24 10:12 stedolan

This proposal was harder to read than it could have been because some terminology is used in a non-standard way. What's called the "grayval list" here is usually known as the remembered set. The grey values or mark stack is a different data structure, used in major GC marking and not relevant for the minor GC. There's an excellent glossary at memorymanagement.org which defines all the relevant terms.

For the actual second suggestion, I'm afraid that incrementing and decrementing a ref counter in the minor heap is likely to be more expensive than maintaining a remembered set in the way it is currently done, at least with multiple domains. The remembered set is per-domain so can be accessed directly, while the counters are shared so must use slower atomic accesses and will cause cache contention between domains.

Thanks for the remark, I updated the term.

In my mind the ref counter is next to the block in the minor heap itself. And incrementing/decrementing does not require a mutex, if it is done atomically. Last time I looked, reference from major head to minor heap where kept in a doubly linked list ? Was this changed ?

craff avatar Dec 03 '24 17:12 craff

@damiendoligez Implemented various approaches to aging in the past, which is what your first suggestion is. They always turned out to perform worse than the current approach because the extra book-keeping dominated the improvement in promotion rate. So eventually we gave up on them.

It is not only a problem a speed when we compare equal minor heap size in the two approach. I would like to have a 16Mb (even more) minor heap in simple_httpd with 64Kb-256Kb slices to keep good latencies. When I see the memory consumption of simple_httpd compared to apache, I can afford very large minor heaps except in the current version latency is too much impacted.

I think this approach could work.

craff avatar Dec 03 '24 17:12 craff

In my mind the ref counter is next to the block in the minor heap itself. And incrementing/decrementing does not require a mutex, if it is done atomically. Last time I looked, reference from major head to minor heap where kept in a doubly linked list ? Was this changed ?

No, I don't think it was ever a doubly linked list? It's a flat per-domain array, which is cheaper than atomic operations that may contend.

It is not only a problem a speed when we compare equal minor heap size in the two approach. I would like to have a 16Mb (even more) minor heap in simple_httpd with 64Kb-256Kb slices to keep good latencies. When I see the memory consumption of simple_httpd compared to apache, I can afford very large minor heaps except in the current version latency is too much impacted.

I think this approach could work.

You'll need to explain why you think this approach could work. The description so far sounds similar to things that have been tried before and did not improve performance.

In particular, your idea for representing the remembered set sounds like it involves scanning the whole minor heap at collection time to find the objects that must be preserved. That alone seems more expensive than the current GC, which touches only the live data.

stedolan avatar Dec 16 '24 11:12 stedolan