Space efficiency

Open drevell opened this issue 13 years ago • 2 comments

We spend a lot of space on storing row keys. We should figure out a way to reduce this. Possibilities:

  • Combine multiple values under a single key
  • Use variable-width key fields
  • Pack key fields across byte boundaries, so each field is sized in bits rather than whole bytes.
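
To make the third option concrete, here is a minimal sketch of packing fields at bit granularity. It is not datacube code: the function names `pack_fields` and `unpack_fields` and the example widths are hypothetical, chosen only for illustration. Fields are packed big-endian and left-aligned, with zero padding on the right to the next byte boundary, which for a fixed set of widths should keep byte-wise key ordering consistent with field-wise ordering.

```python
def pack_fields(values, widths):
    """Pack unsigned ints into bytes, using widths[i] bits for values[i]."""
    acc = 0
    total_bits = 0
    for value, width in zip(values, widths):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        acc = (acc << width) | value
        total_bits += width
    pad = (-total_bits) % 8  # zero bits added on the right to reach a byte boundary
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def unpack_fields(data, widths):
    """Inverse of pack_fields for the same list of widths."""
    total_bits = sum(widths)
    # Drop the right-hand padding bits before extracting fields.
    acc = int.from_bytes(data, "big") >> (len(data) * 8 - total_bits)
    values = []
    for width in reversed(widths):
        values.append(acc & ((1 << width) - 1))
        acc >>= width
    return values[::-1]


def byte_aligned_size(widths):
    """Size in bytes if every field is rounded up to whole bytes."""
    return sum((w + 7) // 8 for w in widths)


# Hypothetical key layout: three fields needing 3, 10 and 5 bits.
widths = [3, 10, 5]
key = pack_fields([5, 700, 17], widths)
```

With these assumed widths the packed key takes 3 bytes, against 4 bytes when each field is rounded up to a byte boundary. The saving depends entirely on how far the real field widths fall short of whole bytes.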

We'll want to preserve backward compatibility.

drevell avatar Aug 07 '12 18:08 drevell

Do we really though? Won't compression at the block level kick in and effectively remove a lot of the redundancy in keys?

eonnen avatar Aug 07 '12 18:08 eonnen

Yeah, we can assume HBase compression to reduce a lot of the cruft. Theoretically datacube is not HBase-specific though :)
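
A rough way to check the compression argument is to compress a run of sorted keys that share a long prefix. This is only a sketch under assumptions: `zlib` stands in for whatever block codec HBase is configured with, and the 16-byte key layout in `make_keys` is hypothetical, not datacube's actual format.

```python
import zlib


def make_keys(n):
    """Build n sorted 16-byte keys that differ only in their trailing counter."""
    # Hypothetical layout: 4-byte cube id, 4-byte dimension id, 8-byte bucket.
    prefix = (1).to_bytes(4, "big") + (7).to_bytes(4, "big")
    return [prefix + i.to_bytes(8, "big") for i in range(n)]


keys = make_keys(1000)
raw = b"".join(keys)              # stand-in for one block of adjacent row keys
compressed = zlib.compress(raw)
ratio = len(compressed) / len(raw)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.2f}")
```

The ratio on real data will differ, since actual blocks also hold column qualifiers, timestamps and values, but the shared key prefixes are exactly the kind of redundancy a block compressor removes.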

This is something I'll code on my own time rather than UA's time since space consumption isn't a problem for us, and we don't have any plans for non-HBase backends.

drevell avatar Aug 07 '12 18:08 drevell