shredder
shredder copied to clipboard
Reduce number of allocations when creating a Gc<T>
Currently it seems that allocating a Gc<T> requires a whopping 4 heap allocations, of the T itself, a GcHandle, a GcData and a Lockout.
There seems to be no reason why a design can't be done with just 1 allocation and with a 1-pointer wide Gc<T>, by removing GcHandle and using Arc<GcData> directly instead of Arc<GcHandle> pointing to Arc<GcData>, and putting the Lockout and data inside the GcData.
As you can probably tell, I optimized for ergonomics, then collection performance, and have barely looked at optimizing memory overhead.
I'm not sure your idea works. We need to track handles with unique identifiers, which means that we must allocate twice. (Once for the handle, and once for the data.) Gc<T>
can't be one pointer wide because it wants a direct pointer to the data (for performance reasons), and it needs to know what handle it is associated with.
Making a GcData<T>
could technically only allocate once, if we inlined the T
into the GcData
The allocation for the Lockout could definitely be removed (just store it inline in the GcData
), but it'd make the code bit grosser.
I may break this into a bunch of issues, since this is touching on a lot of different problems with the code. This is a very new library after all :)
Okay we're down to 3 allocations after #34. I think we can go down to two, but it's slightly tricky because of how deallocating currently works. I think removing that allocation should be done in conjunction with #23, because it will involve a refactoring of the deallocation logic.
Also I'm not sure this is actually the slowest part of Gc::new
. I would expect inserting into the concurrent hash tables is the slowest part.
Still tracking this. I think that this is probably more of an issue now that I got rid of the concurrent hash tables (since the speed of Gc::new
is now probably more tied up to these two allocations).
Another complicating factor is the new support for unsized data. I found that compelling, but it complicates this since embedding unsized data in the GcData
would be really really annoying
Is it still 3 allocations? I would like to give it a try and work with #23 at the same time. Is there any suggestion I could get some insights first?