sled
zstd dictionary support
dictionaries allow us to get far higher compression ratios when compressing small data items. as far as I know, they cannot be generated incrementally. the same dictionary that was used for compression must be present at decompression time. zstd dictionaries have a max size of 112640 bytes (110 KiB) by default, and it's recommended to train on samples totaling at least 100x the dictionary size (about 11 MB at the default) before creating the dictionary.
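to make the sizing concrete, here is a minimal sketch of the arithmetic, assuming the default size and the 100x guideline quoted above. the names `recommended_sample_bytes` and `ready_to_train` are hypothetical helpers for illustration, not part of sled or zstd:

```rust
/// Default maximum zstd dictionary size cited above, in bytes (110 KiB).
const DEFAULT_MAX_DICT_BYTES: usize = 112_640;

/// Recommended ratio of total sample bytes to dictionary size.
const SAMPLE_MULTIPLIER: usize = 100;

/// Minimum total sample bytes suggested before training a dictionary
/// of `dict_bytes` bytes.
fn recommended_sample_bytes(dict_bytes: usize) -> usize {
    dict_bytes * SAMPLE_MULTIPLIER
}

/// Hypothetical gate: has enough sample data been buffered to train
/// a dictionary of `dict_bytes` bytes?
fn ready_to_train(sampled_bytes: usize, dict_bytes: usize) -> bool {
    sampled_bytes >= recommended_sample_bytes(dict_bytes)
}
```

with the default size, `recommended_sample_bytes(DEFAULT_MAX_DICT_BYTES)` works out to 11,264,000 bytes, so a check like `ready_to_train` could decide when there is enough data to build a dictionary.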
some spitball ideas:
- generate a unique dictionary per log message type
- automatically generate a dictionary after N segments, operations, etc...
- store the dictionary in the config file, along with the log offset at which it started being used for compression
- allow dictionaries to be generated upon request at any time
- multiple dictionaries need to be stored as a log, each with the log offset at which it began being used for log reservations
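
one possible shape for the "dictionaries stored with their starting offset" idea, as a rough sketch only. `DictionaryLog`, `LogOffset`, `install`, and `dict_for` are hypothetical names, the dictionary bytes are treated as opaque, and persistence is left out. the point is that decompression can find the right dictionary by looking up the latest entry whose start offset is at or before the message's offset:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for whatever offset type the log uses.
type LogOffset = u64;

/// Maps the offset at which each dictionary became active to its bytes.
#[derive(Default)]
struct DictionaryLog {
    dicts: BTreeMap<LogOffset, Vec<u8>>,
}

impl DictionaryLog {
    /// Record that `dict` is used for all reservations at or after `start`.
    fn install(&mut self, start: LogOffset, dict: Vec<u8>) {
        self.dicts.insert(start, dict);
    }

    /// Find the dictionary in effect for a message written at `offset`:
    /// the entry with the greatest start offset <= `offset`.
    /// Returns `None` for data written before any dictionary existed.
    fn dict_for(&self, offset: LogOffset) -> Option<&[u8]> {
        self.dicts
            .range(..=offset)
            .next_back()
            .map(|(_, dict)| dict.as_slice())
    }
}
```

keeping the entries ordered by offset means old dictionaries stay available for as long as any segment compressed with them is still live.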