seize
Experiment With Amortized Freeing
https://dl.acm.org/doi/pdf/10.1145/3627535.3638491 suggests that batch freeing bypasses thread-local allocator buffers and ends up freeing from a remote thread, which is extremely expensive (note that mimalloc avoids this problem, but just about every other allocator is affected). Amortized freeing can improve both latency and throughput.
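To make the idea concrete, here is a minimal sketch of amortized freeing, assuming a per-thread queue of objects that are already safe to reclaim. `AmortizedFreeList`, `push_batch`, `free_some`, and `budget` are hypothetical names for illustration. They are not part of seize's API and not the paper's implementation.

```rust
use std::collections::VecDeque;

/// Hypothetical per-thread queue of objects that are already safe to free.
struct AmortizedFreeList<T> {
    pending: VecDeque<Box<T>>,
    /// Upper bound on how many objects a single call may free (assumed knob).
    budget: usize,
}

impl<T> AmortizedFreeList<T> {
    fn new(budget: usize) -> Self {
        Self { pending: VecDeque::new(), budget }
    }

    /// Queue a whole reclaimed batch instead of freeing it immediately.
    fn push_batch(&mut self, batch: impl IntoIterator<Item = Box<T>>) {
        self.pending.extend(batch);
    }

    /// Free at most `budget` objects and return how many were freed.
    /// Meant to be called once per operation so the cost is spread out.
    fn free_some(&mut self) -> usize {
        let n = self.budget.min(self.pending.len());
        for _ in 0..n {
            // Dropping the Box hands the memory back to the allocator.
            drop(self.pending.pop_front());
        }
        n
    }

    /// Number of objects still waiting to be freed.
    fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Calling `free_some` once per operation bounds the work done at any single point, so no individual call pays for the whole batch.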
I experimented a little bit with this in the pool branch, along with a try_steal method that enables reusing pointers that have not been deallocated. Unfortunately, I'm not seeing much benefit without aggressive allocation reuse, and it does make the fast path for dropping a guard more expensive.
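For readers wondering what reuse could look like, below is a rough sketch of the general idea, not the code in the pool branch. `Pool`, `retire`, and `alloc` are hypothetical, and this `try_steal` only borrows the name; its real signature and behavior may differ.

```rust
/// Hypothetical pool of allocations that are safe to reclaim but not yet freed.
struct Pool<T> {
    retired: Vec<Box<T>>,
}

impl<T> Pool<T> {
    fn new() -> Self {
        Self { retired: Vec::new() }
    }

    /// Park a reclaimed allocation rather than returning it to the allocator.
    fn retire(&mut self, value: Box<T>) {
        self.retired.push(value);
    }

    /// Take a parked allocation, if any, so its memory can be reused.
    fn try_steal(&mut self) -> Option<Box<T>> {
        self.retired.pop()
    }

    /// Allocate `value`, reusing a parked allocation when one is available.
    fn alloc(&mut self, value: T) -> Box<T> {
        match self.try_steal() {
            Some(mut slot) => {
                // Overwrite in place: the old value is dropped, the heap slot is kept.
                *slot = value;
                slot
            }
            None => Box::new(value),
        }
    }
}
```

Reuse only pays off when allocations actually go through something like `alloc` often enough to drain the pool, which fits the observation that the benefit is small without aggressive allocation reuse.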
> note that mimalloc avoids this problem, but just about every other allocator is affected
FWIW, I recently discovered https://github.com/microsoft/snmalloc, so I figured I'd mention that it's one more allocator with cheap freeing from a remote thread as a design goal :)
Rethinking this, it would still be a useful configuration option: currently, dropping a guard can lead to unexpected latency spikes, even with the default batch size. Unfortunately, exposing this conditionally without affecting performance in the general case is hard.