Symbol overflow in `BufferBackend` doesn't panic
If I use a `BufferBackend<SymbolU16>`, the same value can sometimes return different symbols. Neither `BufferBackend<SymbolU32>` nor `StringBackend<SymbolU16>` has this issue, and it only happens some of the time. With 12,286 strings, I ended up with between 12,800 and 13,200 symbols, whereas the others always ended up with 12,286 symbols.
Tracking this down further, this appears to actually be the fault of `gen_symbol_for`. Specifically, I believe my symbols got too big for `SymbolU16` because they are non-sequential. However, the `try_from_usize` implementation converts the `usize` with `as` instead of `try_into`, so the value overflows silently without hitting the `encountered invalid symbol` panic.
Yep, this is exactly the issue. I wonder why `TryInto` was not used; maybe for performance reasons? It should be used there in my opinion, so good catch!
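For anyone reading along, here is a minimal standalone sketch of the difference between the two conversions. It does not use the crate's actual code; `index_with_as` and `index_checked` are hypothetical helper names made up for illustration, and only standard-library calls are used.

```rust
use std::convert::TryFrom;

/// Lossy conversion: `as` keeps only the low 16 bits, so large values wrap.
/// (Hypothetical helper, not part of string-interner.)
fn index_with_as(index: usize) -> u16 {
    index as u16
}

/// Checked conversion: returns `None` when the value does not fit in a `u16`.
/// (Hypothetical helper, not part of string-interner.)
fn index_checked(index: usize) -> Option<u16> {
    u16::try_from(index).ok()
}

fn main() {
    let too_big: usize = 70_000;

    // 70_000 - 65_536 = 4_464: the value silently wraps around.
    println!("with `as`:      {}", index_with_as(too_big));

    // The checked variant reports the overflow instead.
    println!("with try_from:  {:?}", index_checked(too_big));

    // Values that fit are unaffected.
    println!("in range:       {:?}", index_checked(4_464));
}
```

With the checked variant, the caller gets a chance to turn the `None` into a panic or an error instead of handing out a wrapped symbol.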
The major difference between `BufferBackend` and the others is that `BufferBackend` indeed has no contiguous index space. Therefore, you easily run out of symbols, especially when only using `SymbolU16`. If possible, I'd use `BufferBackend` with `usize`-based symbols. That way you avoid any checks and conversions, and it should usually still be really fast.
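To illustrate why a non-contiguous index space runs out so quickly, here is a rough model rather than the backend's real layout. It assumes each string is stored as a 1-byte length prefix followed by its bytes, and that the symbol is the entry's starting offset in the buffer. The function names and the placeholder strings are made up; only the count of 12,286 comes from the report above.

```rust
/// Sequential symbols: the n-th string simply gets index n.
fn sequential_symbols(strings: &[&str]) -> Vec<usize> {
    (0..strings.len()).collect()
}

/// Offset-based symbols under an ASSUMED layout: a 1-byte length prefix
/// followed by the string bytes, with the symbol being the start offset.
fn offset_symbols(strings: &[&str]) -> Vec<usize> {
    let mut offsets = Vec::with_capacity(strings.len());
    let mut next = 0usize;
    for s in strings {
        offsets.push(next);
        next += 1 + s.len(); // prefix byte + payload
    }
    offsets
}

fn main() {
    // Placeholder data standing in for 12,286 distinct 7-byte strings
    // (a real interner would deduplicate identical values).
    let words = vec!["example"; 12_286];
    let limit = u16::MAX as usize;

    let largest_seq = *sequential_symbols(&words).last().unwrap();
    let largest_off = *offset_symbols(&words).last().unwrap();

    println!("largest sequential symbol: {largest_seq} (fits in u16: {})", largest_seq <= limit);
    println!("largest offset symbol:     {largest_off} (fits in u16: {})", largest_off <= limit);
}
```

Under these assumptions the sequential scheme stays far below 65,535, while the offset scheme exceeds it once the stored bytes pass 64 KiB, regardless of how few strings there are.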
The problem with using `BufferBackend` with bigger symbols is that it negates all the memory savings I got from it in the first place. In fact, it uses several MB more.
It's all a trade-off. Maybe using `u32` would be a nice sweet spot then? That is exactly why string-interner provides different backends for different needs.
That "several MB more" was with `u32`, actually. The `StringBackend` is the perfect sweet spot for me. Thanks for providing these different backends!