
`[de]compress_to_buffer` overwrites data in `Vec<u8>` target buffer

Open MMeent opened this issue 3 years ago • 1 comments

Both Compressor::compress_to_buffer and Decompressor::decompress_to_buffer overwrite any pre-existing contents of the provided buffer without this being documented.

This makes it hard to use one pre-allocated, contiguous buffer for more than a single (de)compressed block, and it makes zero-copy use of zstd-rs in memory-sensitive workloads much harder.

MMeent avatar May 24 '22 14:05 MMeent

Documentation could certainly be better for the WriteBuf trait.

Note that for Vec<u8>, the capacity is not changed when it is used as a WriteBuf: the vector is never grown, so a buffer created with Vec::new (capacity 0) will never work here.
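The capacity point can be illustrated with the standard library alone. The sketch below makes no zstd calls; `output_buffer` and `spare_room` are hypothetical helper names, not part of zstd-rs. It only shows that a vector from `Vec::new()` has no room to write into, while `Vec::with_capacity` reserves space that a fixed-capacity writer could fill.

```rust
/// Hypothetical helper (not part of zstd-rs): allocate an output buffer with
/// room for at least `max_len` bytes. The vector is still empty (len == 0);
/// only its capacity is reserved.
fn output_buffer(max_len: usize) -> Vec<u8> {
    Vec::with_capacity(max_len)
}

/// Number of bytes that could be appended to `buf` without reallocating.
fn spare_room(buf: &Vec<u8>) -> usize {
    buf.capacity() - buf.len()
}
```

In practice the capacity would probably come from an upper bound on the output size, for example the value reported by `zstd_safe::compress_bound` when compressing.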

Sounds like what you want is to use a Vec + offset, and only write data after this offset. This is actually very similar to a Cursor<Vec<u8>>, so I think I may add a WriteBuf implementation for that.

As its buffer, it would use the space between the current cursor position and the end of the vector's capacity.

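use std::io::Cursor;
use zstd::bulk::Compressor;

fn compress(compressor: &mut Compressor, source: &[u8], target: &mut Vec<u8>) -> std::io::Result<()> {
  // Start writing right after the data already in `target`.
  let start = target.len();
  let mut cursor = Cursor::new(target);
  // `set_position` takes a u64, so convert the usize offset.
  cursor.set_position(start as u64);
  // Compressed bytes go into the spare capacity after `start`.
  compressor.compress_to_buffer(source, &mut cursor)?;
  Ok(())
}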

A current workaround would be to implement WriteBuf for your own wrapper around Vec<u8>.

gyscos avatar May 24 '22 15:05 gyscos

Note: zstd-safe now has impl WriteBuf for Cursor<T: WriteBuf> (which includes Cursor<Vec<u8>>).

It will essentially ignore data before the current position.
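To make "ignore data before the current position" concrete, here is a model of the window such a cursor would expose, using only the standard library. This is not the zstd-safe implementation: `writable_window` and `cursor_at_end` are hypothetical names, and the arithmetic is an assumption based on the description earlier in this thread (the space between the cursor position and the end of the vector's capacity).

```rust
use std::io::Cursor;

/// Hypothetical model, not zstd-safe code: returns `(start, len)` of the
/// region a writer would be allowed to fill, i.e. from the cursor position
/// up to the end of the vector's capacity.
fn writable_window(cursor: &Cursor<Vec<u8>>) -> (usize, usize) {
    let start = cursor.position() as usize;
    let capacity = cursor.get_ref().capacity();
    // saturating_sub: a position past the capacity leaves nothing to write.
    (start, capacity.saturating_sub(start))
}

/// Wrap `target` in a cursor positioned at its current length, so that
/// everything already stored in the vector lies before the window.
fn cursor_at_end(target: Vec<u8>) -> Cursor<Vec<u8>> {
    let start = target.len() as u64;
    let mut cursor = Cursor::new(target);
    cursor.set_position(start);
    cursor
}
```

With a vector that already holds four bytes, the window starts at offset 4 and spans whatever capacity remains, so the existing bytes stay untouched in this model.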

gyscos avatar Nov 23 '22 14:11 gyscos