arcs
arcs copied to clipboard
Store bigint in a non-numeric-string form in sql
DatabaseImpl.kt currently maps bigint to text for storage and retrieval which will cause the bigints to use lexicographic ordering rather than numeric ordering.
Note, the same behavior currently exists for Instant.
Notes from the last few months of thinking about this and related issues:
- BigInt can't be reliably stored in a finite number of SQL BIGNUMs (longs)
- A variable length type must therefore be used to represent them (VarChar/String or Bytes)
- Storing BigInt as string encoded numerics (i.e. "123") isn't particularly efficient and can't support sorting using SQL primitives
- There's no obvious reason to use a String based encoding if we're not using a human readable encoding
- It seems the best encoding would be a mapping to/from Bytes that supports value based sorting (rather than lexicographic)
The following post outlines a system that might be used as a model for a sort preserving encoding (though mapping it to bytes rather than raw values is an exercise for later).
https://dzone.com/articles/how-to-sort-number-strings-numerically
This should only be done if BigInts are accumulating too much storage overhead, and particularly if there is a noticable performance overhead to sorting +/ filtering BigInts in memory.