Implement a memory-compressed storage

Open HDembinski opened this issue 3 years ago • 1 comments

Sophisticated analyses use high-dimensional histograms, for which both disc and memory usage can become an issue. Disc usage is easy to fix by writing a compressed stream with Boost.Serialization and Boost.IOStreams, but this does not fix the memory issue.

It would be interesting to add an experimental memory-compressed storage and see how that performs. In-memory compression can be achieved in several ways.

  • Zero-suppression: Some high-dimensional histograms contain a lot of zeros. A simple compression would replace each run of zeros with a code that indicates how many zeros are in the omitted gap. This is comparatively easy to implement. Frequent re-allocations of memory while the histogram is first being filled will be an issue, but that could be optimized over time.
  • We could use an in-memory compressor like Blosc: https://www.blosc.org/pages/blosc-in-depth
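
The zero-suppression idea from the first bullet can be sketched as a run-length code over a flat array of cell counts. This is only an illustration under stated assumptions, not a proposed storage implementation: the names `zero_suppress` and `zero_expand` are hypothetical and not part of Boost.Histogram, cells are assumed to be `std::uint64_t` counts, and the encoding (a literal `0` marker followed by the run length) is one of several possible choices.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; not part of Boost.Histogram.
//
// Encoding: a non-zero count is stored as-is. A run of n zeros is stored as
// the pair (0, n). A literal 0 never appears as a stored count, so 0 can
// serve as the marker for "a gap follows".
std::vector<std::uint64_t> zero_suppress(const std::vector<std::uint64_t>& cells) {
  std::vector<std::uint64_t> packed;
  std::size_t i = 0;
  while (i < cells.size()) {
    if (cells[i] != 0) {
      packed.push_back(cells[i]);  // non-zero counts are copied unchanged
      ++i;
      continue;
    }
    // Count the length of the zero run starting at position i.
    std::uint64_t run = 0;
    while (i < cells.size() && cells[i] == 0) {
      ++run;
      ++i;
    }
    packed.push_back(0);    // marker
    packed.push_back(run);  // number of zeros in the omitted gap
  }
  return packed;
}

// Inverse of zero_suppress: expands each (0, n) pair back into n zeros.
std::vector<std::uint64_t> zero_expand(const std::vector<std::uint64_t>& packed) {
  std::vector<std::uint64_t> cells;
  for (std::size_t i = 0; i < packed.size(); ++i) {
    if (packed[i] != 0) {
      cells.push_back(packed[i]);
    } else {
      if (i + 1 >= packed.size()) break;  // truncated stream: stop decoding
      ++i;                                // move to the run length
      cells.resize(cells.size() + static_cast<std::size_t>(packed[i]), 0);
    }
  }
  return cells;
}
```

With this encoding an isolated zero costs two entries instead of one, so it only pays off when zeros come in long runs. It also only covers a packed snapshot: incrementing a cell inside a gap would require splitting the run and shifting the tail, which is where the re-allocation cost mentioned above would show up in a real storage.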

HDembinski avatar Mar 24 '21 14:03 HDembinski

Blosc is BSD-licensed, so it is compatible with the Boost Software License (BSL).

https://www.blosc.org/pages/blosc-in-depth/ https://github.com/Blosc/c-blosc2

HDembinski avatar Jul 21 '22 10:07 HDembinski