arcs icon indicating copy to clipboard operation
arcs copied to clipboard

Store bigint in a non-numeric-string form in sql

Open Cypher1 opened this issue 5 years ago • 2 comments

DatabaseImpl.kt currently maps bigint to text for storage and retrieval which will cause the bigints to use lexicographic ordering rather than numeric ordering.

Cypher1 avatar Aug 05 '20 00:08 Cypher1

Note, the same behavior currently exists for Instant.

Cypher1 avatar Aug 05 '20 05:08 Cypher1

Notes from the last few months of thinking about this and related issues:

  • BigInt can't be reliably stored in a finite number of SQL BIGNUMs (longs)
  • A variable length type must therefore be used to represent them (VarChar/String or Bytes)
  • Storing BigInt as string encoded numerics (i.e. "123") isn't particularly efficient and can't support sorting using SQL primitives
  • There's no obvious reason to use a String based encoding if we're not using a human readable encoding
  • It seems the best encoding would be a mapping to/from Bytes that supports value based sorting (rather than lexicographic)

The following post outlines a system that might be used as a model for a sort preserving encoding (though mapping it to bytes rather than raw values is an exercise for later).

https://dzone.com/articles/how-to-sort-number-strings-numerically

This should only be done if BigInts are accumulating too much storage overhead, and particularly if there is a noticable performance overhead to sorting +/ filtering BigInts in memory.

Cypher1 avatar Feb 18 '21 00:02 Cypher1