Implement support for BigDecimal / Arbitrary precision ints/decimals

Open velvia opened this issue 8 years ago • 0 comments

Key questions:

  1. Base type to use
  • Proposal: http://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html
  • java.math.BigDecimal has the best compatibility across the data ecosystem (C*, Spark, which uses a BigDecimal-based variant, and others)
  • There is also Scala's BigDecimal, which has nice operators and nice functions for determining if it can fit into long/int/double etc., but has a slight overhead.
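To make the base-type discussion concrete, here is a minimal sketch of how the parts of a java.math.BigDecimal can be read and how a "fits into a long" check could be written on top of it. The class name `DecimalParts` and the helper `fitsInLong` are hypothetical names chosen for illustration and are not part of Filo. The check assumes the strict rule of scale == 0, so a value such as 1E+2 (scale -2) is treated as not fitting.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper class for illustration; not part of Filo.
class DecimalParts {
    // True when the scale is exactly 0 and the unscaled value fits in a signed 64-bit long.
    // BigInteger.bitLength() excludes the sign bit, so anything below 64 fits.
    static boolean fitsInLong(BigDecimal d) {
        return d.scale() == 0 && d.unscaledValue().bitLength() < 64;
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("123.45");
        BigInteger unscaled = d.unscaledValue(); // 12345
        int scale = d.scale();                   // 2
        System.out.println(unscaled + " x 10^-" + scale);
        System.out.println(fitsInLong(new BigDecimal("42")));
        System.out.println(fitsInLong(d));
    }
}
```

Scala's BigDecimal wraps a java.math.BigDecimal, so a check like this would apply to either choice of base type.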

BigDecimal has two components: a variable-precision unscaled value and a 32-bit scale.

  2. Filo binary format

  • Anything that fits in Long/Int/smaller (scale == 0 for all numbers, unscaled value < 64 bits) should use SimplePrimitiveVector / DiffPrimitiveVector for efficient storage.
  • Otherwise, store unscaled values as binary blobs. Most of the time the sizes of these blobs are probably very similar, so consider something like FixedBinaryVector to save space (a saving of 8 bytes per element). Store the scale as a SimplePrimitiveVector, since most scale numbers are very small and can be stored efficiently.
  • Consider converting what we can to floats/doubles for efficiency, provided there is no loss when converting back and forth
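The storage options above could be selected per column roughly as follows. This is only a sketch under stated assumptions: the class `DecimalEncodingChooser`, the `Encoding` enum, and all method names are hypothetical and do not use any Filo API. It also assumes that "no loss" for doubles means the numeric value survives a round trip through `BigDecimal.valueOf(double)`; the original scale is not preserved by that path (1.10 comes back as 1.1), so a real format would have to decide whether that matters.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical sketch of choosing a storage strategy for a column of decimals.
class DecimalEncodingChooser {
    // Illustrative labels for the three strategies in the list above.
    enum Encoding { LONG_PRIMITIVE, DOUBLE_PRIMITIVE, BLOB_PLUS_SCALE }

    // Scale is 0 and the unscaled value fits in a signed 64-bit long.
    static boolean fitsInLong(BigDecimal d) {
        return d.scale() == 0 && d.unscaledValue().bitLength() < 64;
    }

    // Assumption: "lossless" means the numeric value is recovered after
    // BigDecimal -> double -> BigDecimal; the scale may differ.
    static boolean roundTripsThroughDouble(BigDecimal d) {
        double x = d.doubleValue();
        if (Double.isInfinite(x)) return false; // value out of double range
        return BigDecimal.valueOf(x).compareTo(d) == 0;
    }

    static Encoding choose(List<BigDecimal> values) {
        boolean allLong = true;
        boolean allDouble = true;
        for (BigDecimal d : values) {
            allLong &= fitsInLong(d);
            allDouble &= roundTripsThroughDouble(d);
        }
        if (allLong) return Encoding.LONG_PRIMITIVE;
        if (allDouble) return Encoding.DOUBLE_PRIMITIVE;
        return Encoding.BLOB_PLUS_SCALE;
    }

    // Fallback path: unscaled values as two's-complement byte blobs...
    static byte[][] unscaledBlobs(List<BigDecimal> values) {
        byte[][] blobs = new byte[values.size()][];
        for (int i = 0; i < blobs.length; i++) {
            blobs[i] = values.get(i).unscaledValue().toByteArray();
        }
        return blobs;
    }

    // ...and the scales as a separate column of small ints.
    static int[] scales(List<BigDecimal> values) {
        int[] out = new int[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i).scale();
        }
        return out;
    }
}
```

In this sketch the long check takes priority over the double check, matching the order of the bullets. The blob and scale arrays stand in for whatever FixedBinaryVector and SimplePrimitiveVector would actually hold.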

velvia, Dec 17 '15 09:12