zstd dictionary support

Open spacejam opened this issue 5 years ago • 0 comments

dictionaries allow us to get far higher compression ratios when compressing small data items. as far as I know, they cannot be generated incrementally. the same dictionary that was used for compression must be present at decompression time. zstd dictionaries have a max size of 112640 bytes (110 KiB) by default, and it's recommended to sample at least 100x this much data (roughly 11 MB) before creating the dictionary.
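to make the sizing concrete, here is a minimal sketch of the arithmetic. the constant and function names are hypothetical, not part of sled or of the zstd crate; only the two numbers come from the description above.

```rust
/// Default maximum zstd dictionary size in bytes, as quoted above.
const DEFAULT_MAX_DICT_SIZE: usize = 112_640;

/// Suggested ratio of sample data to dictionary size, as quoted above.
const SAMPLE_RATIO: usize = 100;

/// Hypothetical helper: minimum total bytes of sample data to collect
/// before training a dictionary of `dict_size` bytes.
fn min_sample_bytes(dict_size: usize) -> usize {
    dict_size * SAMPLE_RATIO
}

/// Hypothetical helper: true once `collected` bytes of samples are
/// enough to train a dictionary of `dict_size` bytes.
fn ready_to_train(collected: usize, dict_size: usize) -> bool {
    collected >= min_sample_bytes(dict_size)
}
```

with the default size, `min_sample_bytes(DEFAULT_MAX_DICT_SIZE)` comes to 11,264,000 bytes, so a check like `ready_to_train` could gate any automatic dictionary generation.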

some spitball ideas:

  • generate a unique dictionary per log message type
  • automatically generate a dictionary after N segments, operations, etc.
  • store the dictionary in the config file, along with the log offset at which it started being used for compression
  • allow dictionaries to be generated upon request at any time
  • multiple dictionaries need to be stored as a log, each with the log offset at which it began being used for log reservations
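
the last idea could be sketched roughly as follows. this is only an illustration of the offset lookup, assuming dictionaries are opaque byte vectors and offsets are `u64`; `DictionaryLog`, `install`, and `dict_for` are hypothetical names, not sled's actual types or API.

```rust
use std::collections::BTreeMap;

/// Hypothetical registry of dictionaries, keyed by the log offset at
/// which each one began being used for log reservations.
#[derive(Default)]
struct DictionaryLog {
    by_start_offset: BTreeMap<u64, Vec<u8>>,
}

impl DictionaryLog {
    /// Record that `dict` applies to everything written at or after
    /// `start_offset`, until a later dictionary is installed.
    fn install(&mut self, start_offset: u64, dict: Vec<u8>) {
        self.by_start_offset.insert(start_offset, dict);
    }

    /// Return the dictionary in effect at `offset`: the entry with the
    /// greatest start offset that is <= `offset`. Returns `None` for
    /// data written before any dictionary was installed.
    fn dict_for(&self, offset: u64) -> Option<&[u8]> {
        self.by_start_offset
            .range(..=offset)
            .next_back()
            .map(|(_, dict)| dict.as_slice())
    }
}
```

keeping every dictionary rather than only the newest one matters because older log entries can only be decompressed with the dictionary they were written under.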

spacejam, Jul 29 '20 19:07