
Implement a schema-id system to make storage and transmission more efficient.

Open ecton opened this issue 4 years ago • 3 comments

Updated after #44.

We now have properly namespaced strings in use everywhere. One issue with using strings for collection IDs is that they're sent across the network in full. A collection ID should need at most 4 bytes if we had a reliable, collision-free way to convert these strings to u32s. Originally, the idea was to use a hash. However, there's a more correct way to do it that ensures there will be no collisions:

  • [ ] Create a system in which the server tracks which IDs each connected client already knows. Send the full names to the client only when a mapping is missing; otherwise, send the IDs.
    • [ ] In server:
      • [ ] For requests, names should take an Either<Name,ID>, allowing the client to specify a name when it doesn't have a cached value yet. For responses, the server should track whether a client has received a given ID yet, and if not, send the missing mappings before sending the response that uses those IDs.
    • [ ] In client:
      • [ ] A lookup table is established using the values sent from the server. Responses are translated through this lookup table.
      • [ ] When sending a request, check whether the name has a cached ID, and if so, use it instead of the name.
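
The checklist above can be sketched roughly as follows. This is only an illustration under stated assumptions, not bonsaidb's actual implementation: `NameOrId`, `ServerRegistry`, `ClientSession`, and `ClientCache` are hypothetical names, and IDs are assumed to be handed out sequentially by the server, which is what rules out collisions.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical wire type: a collection is referenced by full name or by compact ID.
#[derive(Debug, Clone, PartialEq, Eq)]
enum NameOrId {
    Name(String),
    Id(u32),
}

/// Server side: IDs are assigned sequentially, so two names never share an ID.
#[derive(Default)]
struct ServerRegistry {
    ids_by_name: HashMap<String, u32>,
    names_by_id: Vec<String>,
}

impl ServerRegistry {
    /// Returns the ID for `name`, assigning the next free ID on first use.
    fn id_for(&mut self, name: &str) -> u32 {
        if let Some(id) = self.ids_by_name.get(name) {
            return *id;
        }
        let id = self.names_by_id.len() as u32;
        self.names_by_id.push(name.to_string());
        self.ids_by_name.insert(name.to_string(), id);
        id
    }

    /// Resolves a request's reference; unknown IDs yield `None`.
    fn resolve(&mut self, reference: &NameOrId) -> Option<u32> {
        match reference {
            NameOrId::Name(name) => Some(self.id_for(name)),
            NameOrId::Id(id) => {
                if (*id as usize) < self.names_by_id.len() {
                    Some(*id)
                } else {
                    None
                }
            }
        }
    }
}

/// Server-side, per-connection state: which mappings this client has been sent.
#[derive(Default)]
struct ClientSession {
    sent: HashSet<u32>,
}

impl ClientSession {
    /// Mappings that must be sent before a response that uses `ids`.
    fn missing_mappings(&mut self, registry: &ServerRegistry, ids: &[u32]) -> Vec<(u32, String)> {
        let mut missing = Vec::new();
        for &id in ids {
            if let Some(name) = registry.names_by_id.get(id as usize) {
                // `insert` returns true only the first time this ID is recorded.
                if self.sent.insert(id) {
                    missing.push((id, name.clone()));
                }
            }
        }
        missing
    }
}

/// Client side: lookup table built from the mappings the server sends.
#[derive(Default)]
struct ClientCache {
    ids_by_name: HashMap<String, u32>,
    names_by_id: HashMap<u32, String>,
}

impl ClientCache {
    /// Records mappings received from the server.
    fn learn(&mut self, mappings: &[(u32, String)]) {
        for (id, name) in mappings {
            self.ids_by_name.insert(name.clone(), *id);
            self.names_by_id.insert(*id, name.clone());
        }
    }

    /// Uses the cached ID when one exists, otherwise falls back to the full name.
    fn reference(&self, name: &str) -> NameOrId {
        match self.ids_by_name.get(name) {
            Some(id) => NameOrId::Id(*id),
            None => NameOrId::Name(name.to_string()),
        }
    }

    /// Translates an ID found in a response back to its name.
    fn name_of(&self, id: u32) -> Option<&str> {
        self.names_by_id.get(&id).map(|name| name.as_str())
    }
}
```

In this sketch the first request for a collection carries its full name, the server replies with the mapping ahead of the response, and every later request from that client can carry the 4-byte ID instead.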

ecton avatar Apr 15 '21 18:04 ecton

After going down this rabbit hole, I found that const fn support isn't at the stage where loops can be written. I tried to write a non-loop-based hashing function, but I couldn't figure out how to iterate over an arbitrary string/byte input.
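
For reference, here is a minimal sketch of what a compile-time hash over arbitrary bytes could look like, assuming a toolchain where `while` loops are accepted inside `const fn` (stable since Rust 1.46). FNV-1a is swapped in purely as an example algorithm; `fnv1a_32` and `EXAMPLE_ID` are hypothetical names, and any hash like this can still collide, which is the weakness the ID-mapping approach avoids.

```rust
/// 32-bit FNV-1a over a byte slice, written so it can run at compile time.
/// Iteration uses an index-based `while` loop instead of a `for` loop.
const fn fnv1a_32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5; // FNV-1a 32-bit offset basis
    let mut i = 0;
    while i < bytes.len() {
        hash ^= bytes[i] as u32;
        hash = hash.wrapping_mul(0x0100_0193); // FNV 32-bit prime
        i += 1;
    }
    hash
}

/// Hypothetical collection ID computed at compile time from its name.
const EXAMPLE_ID: u32 = fnv1a_32("example.collection".as_bytes());
```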

Ultimately this is a lot more complex than I had originally thought. Backup/Restore relies on collections being named with strings, and as I looked at replacing that, I really didn't like the idea.

I'm going to leave this issue titled the way it is currently, but I'm thinking that the ultimate solution here is to solve view-namespacing by including the collection name automatically.

ecton avatar Apr 15 '21 20:04 ecton

After 46f374b215a0822c2c0097a317964667b3687952, this now is purely a network optimization, and as such doesn't need to impact the local crate at all. It can be handled by client/server solely as an implementation detail.

ecton avatar May 11 '21 16:05 ecton

Removed from v0.1.0 because of the previous update: This no longer affects the actual storage, and thus isn't an impediment to stabilizing.

ecton avatar Jul 13 '21 18:07 ecton