zstd
Support limiting the frame size via the zstd command line tool
Is your feature request related to a problem? Please describe. We use zstd to compress log files which are later read and replayed. Because the files are pretty large (possibly larger than the replay system's available memory when several of them are replayed at once), our writer currently flushes (forces the end of a frame) after reaching a given uncompressed input size, using the zstd API. This partitions our files into frames of 16MB uncompressed size. The loader can then load the file frame by frame, dropping the data from the previous frame when it is no longer needed. It can also search for a given point in time in our ordered input by running a binary search over the still-compressed file: pick the next location to jump to, locate the frame at that location, decompress only that frame, and so on. At the cost of very little compression ratio, the memory consumption of random-access jumps is reduced to a fixed amount and the speed increases drastically.

There is one problem though. Sometimes data is written in uncompressed form and one wants to compress it afterwards. This is conveniently done using the zstd command line tool, which sadly compresses all my data into one huge frame, which renders my binary search useless.
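To illustrate the search described above, here is a minimal sketch, not our actual loader. It assumes the frame boundaries are already known and that a hypothetical callback `first_key(i)` decompresses frame `i` and returns the timestamp of its first record. Both the callback and the integer timestamps are assumptions made for the example.

```python
from typing import Callable


def find_frame(num_frames: int, first_key: Callable[[int], int], target: int) -> int:
    """Return the index of the last frame whose first key is <= target.

    first_key(i) is a hypothetical callback: it is expected to decompress
    frame i only and return the key (e.g. timestamp) of its first record.
    If target is smaller than every key, frame 0 is returned.
    """
    if num_frames <= 0:
        raise ValueError("need at least one frame")
    lo, hi = 0, num_frames - 1
    while lo < hi:
        # Round up so the loop always makes progress when lo + 1 == hi.
        mid = (lo + hi + 1) // 2
        if first_key(mid) <= target:
            lo = mid       # target is in frame mid or a later one
        else:
            hi = mid - 1   # target is in an earlier frame
    return lo
```

Only the frames probed by the search are decompressed, so memory use stays bounded by one frame. With a single huge frame, the same search would have to decompress everything up to the target.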
Describe the solution you'd like An option for the command line tool, --max-frame-size=X, which limits the output frame size to X. I don't care whether this limits the compressed or the uncompressed frame size, as I can adjust X accordingly.
Describe alternatives you've considered We could write our own zstd command line tool which does this, which we would rather avoid. I tried different tricks with the streaming API, which didn't work.
Additional context The zstd API allows me to force the end of a frame, which I use in my application to write frames of a certain size by counting uncompressed input and forcing the frame end once the limit is reached.
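For reference, the partitioning logic amounts to the sketch below. It is an illustration under assumptions, not our production writer. `compress_frame` is a hypothetical callable that must turn one chunk into one complete zstd frame. In the C API this corresponds to ending the frame with `ZSTD_compressStream2` and `ZSTD_e_end`. With the third-party python-zstandard package, `ZstdCompressor().compress` should serve the same purpose, though that is not tested here.

```python
from typing import BinaryIO, Callable, List

# 16 MB of uncompressed input per frame, the value mentioned in this issue.
FRAME_INPUT_SIZE = 16 * 1024 * 1024


def write_framed(src: BinaryIO, dst: BinaryIO,
                 compress_frame: Callable[[bytes], bytes],
                 frame_input_size: int = FRAME_INPUT_SIZE) -> List[int]:
    """Compress src into dst as consecutive independent frames.

    compress_frame is a hypothetical callable that returns one complete
    frame for the given chunk. Returns the start offset of every frame
    in dst, which a reader could use as a seek index.
    """
    offsets: List[int] = []
    pos = 0
    while True:
        chunk = src.read(frame_input_size)
        if not chunk:          # end of input
            break
        frame = compress_frame(chunk)
        offsets.append(pos)    # this frame starts at the current output position
        dst.write(frame)
        pos += len(frame)
    return offsets
```

Note that this cuts purely by byte count, so a record may straddle two frames. A real writer would probably end the frame at the next record boundary instead. Concatenated frames still form a valid zstd stream, so the result stays readable by the regular decompressor.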