Saving and restoring arenas

Open mux opened this issue 1 year ago • 0 comments

I am currently investigating the possibility of using explicit arenas to isolate specific allocations in memory, dump them to a file, and then restore them in another process by mmap()'ing that file. The goal is to optimize application startup time by processing a large set of business data (~90GB) only once and then distributing the resulting memory files to all application instances. All of those machines have a compatible memory layout, as they run the same architecture and therefore share alignment constraints and endianness. Another unavoidable requirement is that this memory region be mapped with MAP_FIXED at the same address it occupied in the original process, so that pointers within the data remain valid; I am hoping that won't be too much of an issue.

By manually creating an arena with arenas.create, passing the resulting index via MALLOCX_ARENA(i), and disabling the tcache with MALLOCX_TCACHE_NONE, I seem to obtain the desired behavior as far as allocations are concerned. However, I have not yet found a way to obtain the arena's base address and size so that I can actually extract the memory belonging to it. Is there a proper way to obtain this information that doesn't involve guesswork? I am currently just remembering the lowest and highest allocated addresses, which works but is not strictly correct in the presence of deallocations. It might be better to look at /proc/<pid>/maps, but I have no idea whether every arena is associated with a separate mapping there.
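For the /proc/<pid>/maps idea, the following Linux-only sketch looks up the mapping that contains a given address, for example a pointer returned from the arena. The function name `mapping_containing` is hypothetical. It only reports what the kernel sees: as far as I know, adjacent anonymous mappings with identical permissions can be merged into one entry, and nothing here establishes that one arena corresponds to exactly one mapping.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find the /proc/self/maps entry that contains `addr`.
 * On success returns 0 and stores the half-open range [*start, *end).
 * Returns -1 if the file cannot be read or no entry matches. */
int mapping_containing(const void *addr, uintptr_t *start, uintptr_t *end)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[1024];
    int rc = -1;

    while (fgets(line, sizeof line, f)) {
        uintptr_t lo, hi;
        /* Each entry begins with "<start>-<end>" in hexadecimal. */
        int parsed = sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &lo, &hi) == 2;

        /* Skip the tail of any entry longer than the buffer. */
        while (strchr(line, '\n') == NULL && fgets(line, sizeof line, f))
            ;

        if (parsed && target >= lo && target < hi) {
            *start = lo;
            *end = hi;
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}
```

Calling this on several pointers allocated from the same arena would at least show whether they land in one mapping or several, which is cheaper to check than to guess.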

Just to be clear, when I say "restoring arenas", I do not mean that jemalloc should necessarily be able to keep using this mapped arena as if nothing happened. That being said, if it is possible for jemalloc to pick up the metadata corresponding to this arena and keep on working on it transparently using a copy-on-write mapping, that would make things far simpler.

If that is not possible, I intend to wrap the allocation functions so that free() and realloc() calls are handled correctly: deallocations of pointers into this memory region would be ignored, and any new memory would be allocated from the regular heap as usual.
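A rough sketch of what such wrappers could look like is below. All names (`restored_region_set`, `in_restored_region`, `app_free`, `app_realloc`) are hypothetical, and the sketch assumes a single restored region whose bounds are known. Since the original allocation size is not available without allocator metadata, `app_realloc` copies up to the end of the region at most, which may copy more bytes than the original object held but never reads unmapped memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Bounds of the restored (mmap'ed) region, as the half-open range [lo, hi). */
static uintptr_t g_region_lo, g_region_hi;

/* Record where the restored region lives. */
void restored_region_set(const void *base, size_t len)
{
    g_region_lo = (uintptr_t)base;
    g_region_hi = g_region_lo + len;
}

/* Nonzero if `p` points into the restored region. */
int in_restored_region(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= g_region_lo && a < g_region_hi;
}

/* free() replacement: pointers into the restored region are left alone. */
void app_free(void *p)
{
    if (p == NULL || in_restored_region(p))
        return;
    free(p);
}

/* realloc() replacement: objects in the restored region are migrated to
 * the regular heap; everything else goes straight to realloc(). */
void *app_realloc(void *p, size_t new_size)
{
    if (!in_restored_region(p))
        return realloc(p, new_size);

    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;

    /* Old size is unknown, so copy no further than the end of the region. */
    size_t avail = (size_t)(g_region_hi - (uintptr_t)p);
    memcpy(fresh, p, new_size < avail ? new_size : avail);
    return fresh;
}
```

The range check is a pair of comparisons, so the cost on the ordinary free() path should be small. The space occupied by objects that were "freed" or migrated out of the region is never reclaimed, which matches the trade-off described above.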

I understand that both approaches will inevitably result in some amount of memory fragmentation, but that is a cost I am happy to pay in this case.

Besides the question about obtaining the arena address and size, I would be happy to hear opinions from others on whether this approach is sound, or whether I am overlooking complications that would require a different solution. I definitely hope this won't involve rolling my own allocator or dealing with extent hooks, but I am prepared to do so if necessary.

Thanks in advance!

Feb 25 '25 16:02 mux