slab improve memory efficiency/API

improve memory efficiency/API

Open alkis opened this issue 6 years ago • 15 comments

Currently a Slab<(u64, u64)> uses 24 bytes per element. This overhead comes from Entry being an enum.

To remove this overhead completely, Entry must become a union. But then:

iteration can't be supported efficiently
even without iteration, remove and get become unsafe

If slab keys become a struct with no public fields the interface becomes a bit safer but still not safe. Nevertheless this is probably a good idea anyway.

To remove most of the overhead, the slab can contain a bitmap of occupied/vacant entries instead of using the enum tag. This has some advantages:

iteration is supported and is more efficient for sparse slabs (you can skip a lot of entries by counting zeros/ones)
remove and get are now safe again And disadvantages:
safe operations can potentially exhibit 2 cache misses per op instead of 1 (one in bitmap, one in vector)

What do you think? Does any of these have place in slab? Or do you think these implementation choices deserve their own (different) crate?

Nov 30 '17 22:11 alkis

I would much rather have safety in favor of space concerns.

That said, not much in the way of implementation details is provided in the description, so I'm not sure what any of the specifics would be.

Dec 16 '17 20:12 carllerche

Out of interest, I just quickly did a prototype of this model, and it was substantially slower (3-5x) for my use case. The problem is that finding a spare slot in the bitmap quickly is hard. If you have a slab with lots of entries that is also densely populated, you can end up having to scan a surprisingly large amount of memory to find an entry. So, for me, the memory I saved wasn't worth the performance loss.

So, while I agree that the slab is pretty wasteful of memory (and that definitely hurts me -- roughly, every 4 bytes I save speeds my program up by 5%), I think there's no way in the current model to fix things. The only realistic possibility that I can see is to allow the user object to encode whether it's dead or not, but that won't play well with builtins like ints. So it's a hard problem AFAICS...

Feb 22 '18 20:02 ltratt

If you do the unsafe approach (union of u32 and T) it is super fast. I have this implementation internally and you can't really beat it.

Feb 22 '18 20:02 alkis

Getting rid of iteration in favor of performance is fine, but safety at the API level is pretty important IMO for this lib.

That said, there is probably some middle ground where you can keep safe ops but trim down the internal space.

Feb 22 '18 20:02 carllerche

I think an awful lot depends on the density of the slab. For dense slabs, I'm now fairly sure that the bitmap is worse; for sparse slabs it might win. I don't know which is more common or, indeed, if users would have much of an intuition about which they might want.

Feb 22 '18 21:02 ltratt

At the very least, the overhead should be reducible to at most 8 bytes (depending on alignment issues).

After thinking about it more, I don't know if I have thoughts off the top of my head re: bitmap vs not. @ltratt brings up a good point that it is probably entirely dependent on usage patterns.

Feb 26 '18 18:02 carllerche

I have two suggestions, and one remark.

Replace usize with u32 in VacantEntry, 64-bits is way overkill. In the most generous case where people are for some reason only storing 4-byte objects in the slab, with the overhead of u32 as well you need to dedicate 32 gigabytes of contiguous storage before you run out of address space in our slab. This is plenty, and reducing the overhead by 50% is definitely worth it for that.
Make an unsafe_slab which @alkis refers to with the union of u32 and T. You can't beat that speed. This allows development with slab and in case performance is not sufficient and correctness is verified it allows swapping out slab with unsafe_slab with ease.

A small remark regarding safety: I'm not sure if marking this API safe to begin with is truly in line with what I'd consider safe. Since references can be re-used slab[i] can refer to a whole different object than when I got the reference i to begin with. This is unavoidable of course, but still food for thought.

Jun 24 '18 12:06 orlp

I scrap my previous remark of it being unavoidable - you can have a counter incrementing on each insert, and changing the data layout to store that counter with each element (using a struct instead of an enum). You can re-use that space to store the freelist when that slot is not in use.

Then an index into the slab has to be changed to also contain that counter for the element it's referencing. Then when the counter in the slab != the counter in the index, you know you're pointing to a removed element. This gives substantially better safety - I'm currently working on a prototype.

Jun 30 '18 21:06 orlp

It is important for the slab token to stay usize sized. This is because it is used as the "user data" portion passed to epoll.

Jul 03 '18 15:07 carllerche

I've implemented the ideas I talked about above in the new crate slotmap.

Jul 17 '18 11:07 orlp

I was surprised to see slab used an enum instead of a union and found this issue. Supporting iteration makes sense as the reason, but if you were to want a slab allocator that doesn't support iteration, you could use a Vec<union { T, usize }> without making remove and get unsafe.

The solution is to use a new Tag type in the API, instead of accepting arbitrary usizes. There would be no safe way to construct a Tag from a usize, only to receive one from the allocator when you put something in it, or to unsafely construct if its necessary (e.g. because you roundtripped it through epoll).

I'm not proposing a change to the slab crate; its possible that in slab's intended use cases there is a need to be able to safely check whether an arbitrary usize is a filled entry or not. I'm just leaving notes for anyone who might want to implement a slab allocator with a slightly different strategy.

EDIT: it would also be necessary that Tag not implement Copy or Clone.. holding a tag would mean holding exclusive access to a slot in the slab.

Oct 02 '19 21:10 withoutboats

@withoutboats: right, but how would you handle drop? You would have no way of knowing which slots are filled and which are not. You can’t get that info out of epoll.

I guess the goal or slab isn’t a true allocator but just a safe store... though implementing insertion counters and other such things is on the user.

Oct 02 '19 23:10 carllerche

@carllerche True! I'm currently thinking about dividing up a byte buffer and didn't consider destructors.

For specialized use cases with a known niche (e.g. a slab of boxes) you could walk the insert list in the destructor and null each empty slot. Obviously not suitable for a generic collection like slab, but might possibly apply to how mio is using it?

Oct 02 '19 23:10 withoutboats

@withoutboats

I'm currently thinking about dividing up a byte buffer and didn't consider destructors.

You mean something similar to how Bytes works? Could you elaborate on how you thought slab would fit? I'm not following the use case.

Oct 03 '19 20:10 carllerche

I just mean I was looking at slab as an example of an implementation of a slab allocator. Slab's not relevant for my use case (for which a buddy allocation scheme is more appropriate).

Oct 03 '19 21:10 withoutboats

slab slab copied to clipboard

improve memory efficiency/API

slab
slab copied to clipboard