bonsaidb
Implement a schema-id system to make storage and transmission more efficient.
Updated after #44.
We now have properly namespaced strings in use everywhere. One issue with using strings for collection IDs is that the full strings are sent across the network. A collection ID should need at most 4 bytes if we had a reliable, collision-free way to convert these strings to u32s. Originally, the idea was to use a hash. However, there's a more correct way to do it that ensures there will be no collisions:
- [ ] Create a system in which connected clients cache which IDs are known and which are not, so that full names are sent to the client only when needed and IDs are sent otherwise.
- [ ] In server:
  - [ ] For requests, names should take an Either<Name,ID>, allowing the client to specify a name when it doesn't have a cached value yet. For responses, the server should track whether a client has received a given ID yet, and if not, send the missing mappings before sending the response that uses those IDs.
- [ ] In client:
  - [ ] A lookup table is established using the values sent from the server. Responses are translated through this lookup table.
  - [ ] When sending a request, check whether a name has a cached ID, and if so, use it instead of the name.
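The checklist above could be sketched roughly as follows. This is only an illustration of the idea, not BonsaiDb's actual API: `NameOrId` (standing in for `Either<Name,ID>`), `ClientCache`, `ServerRegistry`, and `ClientSession` are all hypothetical names, and assigning IDs sequentially is an assumption that simply makes collisions impossible.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical wire type standing in for `Either<Name, ID>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum NameOrId {
    Name(String),
    Id(u32),
}

/// Hypothetical client-side lookup table, filled from mappings sent by the server.
#[derive(Default)]
pub struct ClientCache {
    by_name: HashMap<String, u32>,
    by_id: HashMap<u32, String>,
}

impl ClientCache {
    /// Record a name <-> ID mapping received from the server.
    pub fn learn(&mut self, name: &str, id: u32) {
        self.by_name.insert(name.to_string(), id);
        self.by_id.insert(id, name.to_string());
    }

    /// Use the cached ID when one exists; otherwise fall back to the full name.
    pub fn encode(&self, name: &str) -> NameOrId {
        match self.by_name.get(name) {
            Some(id) => NameOrId::Id(*id),
            None => NameOrId::Name(name.to_string()),
        }
    }

    /// Translate an ID found in a response back into its name.
    pub fn decode(&self, id: u32) -> Option<&str> {
        self.by_id.get(&id).map(String::as_str)
    }
}

/// Hypothetical server-side registry. IDs are assigned sequentially
/// (an assumption of this sketch), so two names can never share an ID.
#[derive(Default)]
pub struct ServerRegistry {
    ids: HashMap<String, u32>,
    names: Vec<String>,
}

impl ServerRegistry {
    /// Return the ID for `name`, assigning the next free one if needed.
    pub fn id_for(&mut self, name: &str) -> u32 {
        if let Some(id) = self.ids.get(name) {
            return *id;
        }
        let id = self.names.len() as u32;
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), id);
        id
    }

    /// Resolve a request key; unknown IDs are rejected with `None`.
    pub fn resolve(&mut self, key: &NameOrId) -> Option<u32> {
        match key {
            NameOrId::Name(name) => Some(self.id_for(name)),
            NameOrId::Id(id) => ((*id as usize) < self.names.len()).then(|| *id),
        }
    }
}

/// Hypothetical per-connection state: the IDs this client already knows.
#[derive(Default)]
pub struct ClientSession {
    sent: HashSet<u32>,
}

impl ClientSession {
    /// Returns the mapping that must precede a response using `id`,
    /// or `None` if this client has already received it.
    pub fn mapping_to_send(
        &mut self,
        registry: &ServerRegistry,
        id: u32,
    ) -> Option<(String, u32)> {
        let name = registry.names.get(id as usize)?;
        if self.sent.insert(id) {
            Some((name.clone(), id))
        } else {
            None
        }
    }
}
```

In this sketch the first request for a collection carries `NameOrId::Name`, the server replies with the mapping followed by the response, and every later request for the same collection carries only the 4-byte `NameOrId::Id`.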
After going down this rabbit hole, I found that const fn support isn't yet at the stage where loops can be written. I tried to write a non-loop-based hashing function, but I couldn't figure out how to iterate over an arbitrary string/byte input.
Ultimately this is a lot more complex than I had originally thought. Backup/Restore relies on collections being named with strings, and as I looked at replacing that, I really didn't like the idea.
I'm going to leave this issue titled the way it is currently, but I'm thinking that the ultimate solution here is to solve view-namespacing by including the collection name automatically.
After 46f374b215a0822c2c0097a317964667b3687952, this is now purely a network optimization, and as such doesn't need to impact the local crate at all. It can be handled solely by the client and server as an implementation detail.
Removed from v0.1.0 because of the previous update: This no longer affects the actual storage, and thus isn't an impediment to stabilizing.