
Streaming compression in exact compressed size chunks

Open yeenow123 opened this issue 9 months ago • 0 comments

Describe the bug

Hello, I'm not sure whether this is a bug, the wrong API for the job, user error, or simply not possible.

My use case is streaming compression of an arbitrarily large binary input file into compressed chunks of an exact, fixed size, without reading the full file into memory.

To Reproduce

Using zstd v1.5.2 (happy to upgrade if this is the reason)

Some steps (happy to provide some code if needed):

  1. Create an output buffer of chunk size (8MiB)
  2. Read 8MiB from the file into an input buffer and pass it to ZSTD_compressStream2 repeatedly until the output buffer is full (i.e. output.pos == output.size)
  3. Perform arbitrary logic on the output buffer chunk of 8MiB
  4. Go back to step 2 (including handling anything left in the input buffer) until we're at EOF

Compression appears to succeed. However, when I decompress the result, the file has the same size as the original but its contents are not identical. Both cmp and diff report differences between the original and the decompressed output.

Expected behavior

When I allocate an output buffer larger than the expected compressed size of the file, the same procedure works just fine.

Another way to implement this would be to compress until the compressed size exceeds the chunk size and then handle the alignment myself, but I was hoping not to have to do that.

yeenow123 • Mar 31 '25 23:03