Support for localised heaps
I'm integrating with a tracing garbage collector, but I have many heaps (I've got Erlang-style processes, each with its own heap). As I understand it from the documentation, this is not currently possible.
Is there anything I can do to help with the support for this?
Hi @jjl,
The documentation states that std::allocator-like APIs are not supported. It makes sense to support them in the long run, but this is non-trivial work. Before I get into listing the challenges involved, let's analyse your problem a bit further.
Can you send me a link to the documentation of the garbage collector you are trying to integrate?
You say these are per-process allocators. How is data transferred between processes? Can I copy a pointer to a block allocated in process A from the stack of process B, even when there are no references to it from process A? Likewise, what about blocks allocated from process B referencing blocks from process A?
Also, you probably have an API to get the current process id and its corresponding allocator, right? If so, you should be fine with the current API. Let me know a bit more about the specifics of this API and I can help you out.
Hi
The garbage collector isn't one of the standard implementations because I'm being an awkward bugger^W^W^W^Wpushing boundaries again...
So my processes look something like this:
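#include <cstddef>

class heap {
public:
    // Returns space for one heap item of `size` bytes.
    void* allocate(std::size_t size);
};
class process {
    heap data;  // each process owns its own heap
};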
The heap is actually a std::vector of bytes, and when you call allocate it also adds the space needed to hold a heap item header (a length and some tag bits). The gc itself is a simple mark/sweep, stop-the-world, precise collector I threw together in a few minutes.
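A minimal sketch of a heap along those lines, assuming a fixed-capacity byte vector and a hypothetical two-word header (the post only says "length and some tag bits", so the `item_header` layout, the 16-byte rounding and the `header_of` helper are illustrative choices, not the author's code):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical header layout; the real one is not shown in the thread.
struct item_header {
    std::uint64_t length;  // payload size in bytes, as requested by the caller
    std::uint64_t tags;    // room for tag bits, e.g. a mark bit for the collector
};

class heap {
public:
    // The vector is sized once and never resized, so returned pointers stay valid.
    explicit heap(std::size_t capacity) : bytes_(capacity), used_(0) {}

    // Reserves header + payload and returns a pointer to the payload.
    // Throws std::bad_alloc when the heap is full.
    void* allocate(std::size_t size) {
        const std::size_t free_bytes = bytes_.size() - used_;
        if (size > free_bytes) throw std::bad_alloc{};
        // Round the payload up to 16 bytes so consecutive items stay aligned.
        const std::size_t payload = (size + 15) & ~std::size_t{15};
        const std::size_t total = sizeof(item_header) + payload;
        if (total > free_bytes) throw std::bad_alloc{};

        const item_header h{static_cast<std::uint64_t>(size), 0};
        std::memcpy(bytes_.data() + used_, &h, sizeof h);
        void* result = bytes_.data() + used_ + sizeof h;
        used_ += total;
        return result;
    }

    std::size_t used() const { return used_; }

    // Reads back the header stored immediately before a payload pointer.
    static item_header header_of(const void* payload) {
        item_header h;
        std::memcpy(&h, static_cast<const unsigned char*>(payload) - sizeof h, sizeof h);
        return h;
    }

private:
    std::vector<unsigned char> bytes_;
    std::size_t used_;
};
```

A sweep phase could walk the vector from offset 0, reading each header to find the next item; that part is left out here.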
When one process sends a message to another process, space is allocated on the receiving process's heap and the message is copied into it. In Erlang, if I were to send a map to another process, it would recursively copy the whole thing over to that process's heap. I do have some plans to limit the cost of this in my implementation that involve limited sharing, but thought I'd solve the simple problem before I solve the hard one ;)
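The copy-on-send step can be sketched for the simplest case, a flat message with no internal pointers. `mailbox_heap` and `send_message` are hypothetical names, alignment is ignored, and a nested term such as a map would need a recursive copy that this sketch does not attempt:

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

// Hypothetical stand-in for a process heap: a fixed buffer with a bump pointer.
struct mailbox_heap {
    std::vector<unsigned char> bytes;
    std::size_t used = 0;

    explicit mailbox_heap(std::size_t capacity) : bytes(capacity) {}

    void* allocate(std::size_t n) {
        if (n > bytes.size() - used) throw std::bad_alloc{};
        void* p = bytes.data() + used;
        used += n;
        return p;
    }
};

// Allocates on the receiver's heap and copies the message bytes into it.
// The sender's copy is untouched, so each heap can be collected on its own.
void* send_message(mailbox_heap& receiver, const void* msg, std::size_t size) {
    void* copy = receiver.allocate(size);
    std::memcpy(copy, msg, size);
    return copy;
}
```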
Really, what I want is to be able to pass in an allocate() function which will be used to grab memory when it needs it and throw an exception if it fails because it couldn't allocate memory (so I can trigger a garbage collect).
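One way to picture that contract is a small retry wrapper: try the allocation, and if it throws, collect once and try again. `allocate_or_collect` is a hypothetical helper, and using `std::bad_alloc` as the failure signal is an assumption; where the exception can safely be caught (so that everything live is visible to the collector) depends on the runtime and is not addressed here.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: `allocate` is any callable taking a size and returning
// void*, `collect` is any callable that runs a garbage collection.
template <typename Allocate, typename Collect>
void* allocate_or_collect(Allocate&& allocate, Collect&& collect, std::size_t size) {
    try {
        return allocate(size);
    } catch (const std::bad_alloc&) {
        collect();              // e.g. the process's mark/sweep pass
        return allocate(size);  // a second failure propagates to the caller
    }
}
```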
I've probably horrendously misunderstood something because my C++ is terrible and immer is pretty fancy, but the only way I could see to do this would be to have a thread local indicating what the current thread is and a collector that, instead of maintaining state locally, just maintained it globally, or rather thread-locally. Still, it sounds pretty horrible, so maybe you have a better idea?
Cheers, James
P.S. I loved the paper and the talk, great work!
> I've probably horrendously misunderstood something because my C++ is terrible and immer is pretty fancy, but the only way I could see to do this would be to have a thread local indicating what the current thread is and a collector that, instead of maintaining state locally, just maintained it globally, or rather thread-locally. Still, it sounds pretty horrible, so maybe you have a better idea?
Well, I think you already pointed in the direction of the solution... If your processes are backed by actual OS processes, you can use a global. Otherwise, if they are backed by actual threads, you can use a C++11 thread_local variable to keep a reference to the allocator associated with the current thread. This seems to me like the solution with the least overhead, and potentially the cleanest one.
For an example of thread locals in the context of allocators, you can look at the thread_local_free_list_heap adaptor. But of course, if your processes are not backed by actual threads and you are building your own "green thread" infrastructure, you might need to reinvent thread-local storage in your system.
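A rough sketch of the thread_local approach, kept independent of immer so it compiles on its own. `process_heap`, `current_heap` and `current_process_heap` are hypothetical names, and `malloc` stands in for the real per-process allocator. The static `allocate(size)` / `deallocate(size, data)` shape is modeled on how immer's heap policies appear to be written; treat that signature as an assumption and check it against the immer documentation before plugging anything in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical per-process heap; malloc stands in for the real allocator.
struct process_heap {
    std::size_t live_blocks = 0;

    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc{};
        ++live_blocks;
        return p;
    }
    void deallocate(void* p) {
        std::free(p);
        --live_blocks;
    }
};

// One pointer per OS thread, naming the heap of whatever runs there right now.
thread_local process_heap* current_heap = nullptr;

// Stateless facade: only static members, all state lives behind current_heap.
struct current_process_heap {
    static void* allocate(std::size_t size) {
        assert(current_heap && "no process is scheduled on this thread");
        return current_heap->allocate(size);
    }
    static void deallocate(std::size_t /*size*/, void* data) {
        assert(current_heap && "no process is scheduled on this thread");
        current_heap->deallocate(data);
    }
};
```

Because the facade carries no data, containers built on it do not grow by a pointer each; the cost is that a block must be freed while the heap that allocated it is the current one.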
What do you find horrible about it? Everyone in the current process/thread/fiber will want to allocate memory. It seems natural to me to make it available directly to everyone in that scope.
Alternatively, we could change the immer::vector API to support allocators with data. One good way would be to use a std::allocator-style API; that way we would also gain compatibility with the other allocator libraries in the wild. In that case you could model your allocator after the Allocator concept in the standard [1]. This is the equivalent of:
> Really, what I want is to be able to pass in an allocate() function which will be used to grab memory when it needs it and throw an exception if it fails because it couldn't allocate memory (so I can trigger a garbage collect).
When you construct a vector you would need to pass it the allocator associated with the current process (which is probably just a pointer to the actual heap it allocates from). Note that every vector would need to carry this pointer to be able to allocate. Personally I find this a bit awkward for the user, and it might be error prone: what if you pass the wrong allocator? You would also need to make sure you recursively change the allocator in the copies when passing things with vectors inside in a message to another process, etc. But if you are just hiding this in the runtime of some language you are building, maybe it is fine...?
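To make the "every vector carries a pointer" point concrete, here is a minimal stateful allocator following the standard Allocator requirements, shown with std::vector because immer::vector did not accept such allocators at the time of this thread. `counting_heap` and `heap_allocator` are hypothetical names, and `malloc` again stands in for the real heap.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical per-process heap; counts allocations so the effect is visible.
struct counting_heap {
    std::size_t allocations = 0;

    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (!p) throw std::bad_alloc{};
        ++allocations;
        return p;
    }
    void deallocate(void* p) { std::free(p); }
};

template <typename T>
struct heap_allocator {
    using value_type = T;

    counting_heap* heap;  // the state each container instance has to carry

    explicit heap_allocator(counting_heap* h) noexcept : heap(h) {}

    // Rebinding constructor: containers convert the allocator to other types.
    template <typename U>
    heap_allocator(const heap_allocator<U>& other) noexcept : heap(other.heap) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(heap->allocate(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { heap->deallocate(p); }
};

// Two allocators are interchangeable only if they point at the same heap.
template <typename T, typename U>
bool operator==(const heap_allocator<T>& a, const heap_allocator<U>& b) noexcept {
    return a.heap == b.heap;
}
template <typename T, typename U>
bool operator!=(const heap_allocator<T>& a, const heap_allocator<U>& b) noexcept {
    return !(a == b);
}
```

With this shape, a vector built as `std::vector<int, heap_allocator<int>> v(alloc);` keeps allocating from `alloc.heap` for its whole lifetime, which is exactly the property that makes sending it to another process awkward.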
Thinking about the design in general, it would be nice to support passing vectors as messages without copying the data. This is one of the big wins of immutable data structures. But I understand that it complicates the allocation/garbage collection scheme...
[1] http://en.cppreference.com/w/cpp/concept/Allocator
Sorry, I didn't explain myself very well. For the purposes of this, I'm writing an Erlang implementation (it isn't quite, but close enough). An Erlang VM typically has:
- an OS thread per cpu core
- many green threads being scheduled across them (these are the "processes" I've been talking about, sorry for the confusing Erlang terminology)
- only one OS thread running a given green thread at a time
The idea I had was that each OS thread could maintain a thread local pointing to the current pid, which points to the heap. My scheduler could just update that thread local when it switches green threads, getting around the state thing: all the green threads would share one allocator, and the state would change underneath it.
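The switch itself can be sketched with a small RAII guard that points the thread local at a process for the duration of one time slice and restores the previous value afterwards. `green_process`, `current_process`, `scheduled` and `run_slice` are hypothetical names, and the heap member is omitted to keep the sketch short.

```cpp
// Hypothetical green thread descriptor; the per-process heap would live here.
struct green_process {
    int pid;
};

// What this OS thread is running right now (nullptr between slices).
thread_local green_process* current_process = nullptr;

// RAII guard: installs a process as current and restores the old value on exit,
// including when the slice ends by throwing.
class scheduled {
public:
    explicit scheduled(green_process& p) : previous_(current_process) {
        current_process = &p;
    }
    ~scheduled() { current_process = previous_; }

    scheduled(const scheduled&) = delete;
    scheduled& operator=(const scheduled&) = delete;

private:
    green_process* previous_;
};

// Runs one time slice of `p` on the calling OS thread.
template <typename Step>
void run_slice(green_process& p, Step&& step) {
    scheduled guard(p);
    step();  // any allocation made in here can look up current_process
}
```

Since only one OS thread runs a given green thread at a time, the thread local needs no locking.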
I totally agree that it would be nice to be able to reduce the copying. I've got some plans to help with that in the future, but the reason I've gone with the Erlang model as a basis is that one green thread can gc and it only ties up a single core (for less time, because the heaps are much smaller). Once I've integrated Erlang's crash handling mechanism (on green thread crash, that local heap is reclaimed), I'll look into something involving refcounting and shared heaps to see how it works out, but I need more of the infrastructure in place to be able to start benchmarking that and, as you say, it complicates the problem somewhat...
Cheers, James
Yes, I think that setting a thread_local pid as part of the green thread context switch is the best way to go. To support shallow copying across processes we might actually need to add stateful allocators (though a refcount policy that cooperates with the allocator and understands its headers might work better).
Sounds like a cool project indeed! Ping me eventually when you have something to show 😄