
Support HDF5 compression

Open FlorianReimold opened this issue 2 years ago • 4 comments

HDF5 supports transparent compression using zlib, but eCAL currently cannot make use of it.

  • Windows: We explicitly disabled zlib support, as we are lacking the zlib library
  • Linux: HDF5 should already be compiled with zlib support. We only need the API to turn it on

Steps to take

Prerequisites:

  • [ ] Add zlib as git submodule (or a similar technique). This zlib version will then be used for Windows
  • [ ] Make the main CMakeLists.txt compile zlib and enable it for HDF5. This will probably result in an additional option ECAL_THIRDPARTY_BUILD_ZLIB
  • [ ] Add a zlib compression option to the ecalhdf5 API

eCAL Rec Client:

  • [ ] Add an option in the ecalrec-client-core API for compression
  • [ ] Extend the ecal-rec-client-service protobuf settings to turn compression on and off. The default should be off.

eCAL Rec Server:

  • [ ] Add an option in the ecalrec-server-core API for compression
  • [ ] Extend the ecal-rec-server-service protobuf to turn compression on and off. The default should be off.
  • [ ] Extend the distribution of settings to connected ecal-rec-clients, so the compression setting is included.
  • [ ] Extend the eCAL Rec settings file to save the setting, and make sure it can be loaded again. The settings version should be increased by 1 so that old configs get upgraded automatically. Old versions of eCAL Rec will then show the warning that saving the config with them will strip the unknown setting.
  • [ ] Add a GUI option in the eCAL Rec GUI to control that setting
  • [ ] Add a CLI option in the eCAL Rec CLI to control that setting
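
The settings-version step above can be sketched as follows. This is only an illustration of the upgrade logic: a plain dict stands in for the real eCAL Rec settings file, and the names `CURRENT_SETTINGS_VERSION`, `compression_enabled`, `upgrade_settings` and `has_unknown_settings` are hypothetical, not taken from the eCAL code base.

```python
# Hypothetical version number; the real value depends on the current eCAL Rec settings format.
CURRENT_SETTINGS_VERSION = 3


def upgrade_settings(settings: dict) -> dict:
    """Return a copy of an older config upgraded to the current version."""
    upgraded = dict(settings)
    version = upgraded.get("version", 1)
    if version < CURRENT_SETTINGS_VERSION:
        # Old configs never had the setting, so it gets the default: off.
        upgraded.setdefault("compression_enabled", False)
        upgraded["version"] = CURRENT_SETTINGS_VERSION
    return upgraded


def has_unknown_settings(settings: dict) -> bool:
    """True if the config was written by a newer version than this reader knows."""
    return settings.get("version", 1) > CURRENT_SETTINGS_VERSION


old_config = {"version": 2, "max_file_size_mib": 1000}
new_config = upgrade_settings(old_config)
```

A reader built before the version bump would see `has_unknown_settings` return true for the new file, which is the case where the warning about stripping the unknown setting applies.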

Samples:

  • [ ] Extend the rec service samples so that the compression setting is shown there

GTest:

  • [ ] Write a test for the ecalhdf5 API. The test should at least test that a compressed measurement can be read again. If possible, it should also test that enabling the compression actually results in compressed files.
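
The two checks named above (the data can be read back, and compression actually shrinks it) can be sketched independently of HDF5. The snippet below is not the ecalhdf5 API; it uses Python's standard `zlib` module, which implements the same deflate algorithm as the HDF5 gzip filter, and the helper names and sample payload are made up for illustration.

```python
import zlib


def compress_entry(payload: bytes, level: int = 6) -> bytes:
    """Compress one measurement entry with zlib (deflate)."""
    return zlib.compress(payload, level)


def decompress_entry(blob: bytes) -> bytes:
    """Restore the original bytes of a compressed entry."""
    return zlib.decompress(blob)


# Hypothetical, highly repetitive payload so that compression has something to gain.
payload = b"eCAL measurement sample " * 1000
blob = compress_entry(payload)

# Check 1: the compressed data can be read again without loss.
roundtrip_ok = decompress_entry(blob) == payload
# Check 2: enabling compression actually produced smaller data.
actually_smaller = len(blob) < len(payload)
```

A real test would make the second check on the size of the written measurement file and should use compressible input, since random data may not shrink at all.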

FlorianReimold avatar Aug 05 '22 07:08 FlorianReimold

This would be a wonderful feature to have! Would it be similar to what rosbag record offers, namely --lz4 and --bz2? I hope there will be different compression levels to choose from, so that it can be adapted to different CPU utilisation needs :)

chengguizi avatar Sep 20 '22 10:09 chengguizi

We are currently working on this. HDF5 unfortunately does not support LZ4 out of the box, so at first we are going with whatever is supported by default (zlib / gzip). HDF5 does however support custom compression modules, so adding LZ4 in the future should be possible and (according to what you read on the internet) may improve runtime performance.

FlorianReimold avatar Sep 20 '22 11:09 FlorianReimold

  • Current code support:
    • Implementation of the API for the HDF5 writer
      • The API supports two different compression algorithms: Gzip and Szip
      • The following test cases cover different compression scenarios: TestSzipCompression, TestGzipAndSzipCompression, TegraAGzipCompression, XavierAGzipCompression, ChunkAPI
      • A test case still needs to be added that reads a compressed file and verifies that it is decompressed correctly
  • The eCAL recorder server CLI is supported in code (but not tested)
  • Implementation decisions:
    • The HDF5 API requires chunking to be applied before its compression algorithms can be used
    • Since the compression overhead can outweigh the benefit gained from compression, the chosen approach is to have a threshold value; compression is applied only above this value
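
The threshold decision can be illustrated with a small sketch. It assumes the threshold refers to the payload size in bytes, which the comment above does not state; the constant, its value, and the function name are hypothetical, and Python's `zlib` stands in for the HDF5 gzip filter.

```python
import zlib
from typing import Tuple

# Hypothetical cut-off; the actual value and unit are not given in this thread.
COMPRESSION_THRESHOLD_BYTES = 1024


def maybe_compress(payload: bytes,
                   threshold: int = COMPRESSION_THRESHOLD_BYTES,
                   level: int = 6) -> Tuple[bool, bytes]:
    """Compress only payloads at or above the threshold.

    Returns (was_compressed, data) so that a reader knows how to decode it.
    """
    if len(payload) < threshold:
        # Too small: header and checksum overhead would likely cancel the gain.
        return False, payload
    return True, zlib.compress(payload, level)


# A tiny payload grows when compressed, which is what the threshold avoids.
tiny = b"abc"
tiny_overhead = len(zlib.compress(tiny)) - len(tiny)

small_result = maybe_compress(tiny)
large_flag, large_blob = maybe_compress(b"x" * 4096)
```

Here `tiny_overhead` is positive because the zlib stream adds a header and a checksum, while the 4096-byte payload is above the cut-off and gets compressed.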

bkhalifaconti avatar Dec 29 '22 09:12 bkhalifaconti

The current version of the code can be found here: https://github.com/eclipse-ecal/ecal/tree/feature/hdf5_compression

FlorianReimold avatar Jan 06 '23 11:01 FlorianReimold