flate2-rs
ZlibEncoder doesn't report correct compressed size
The following snippet:
fn main() {
    use std::io::Write;

    let buf = Vec::new();
    let mut file = flate2::write::ZlibEncoder::new(buf, flate2::Compression::best());
    file.write_all(b"Hello world!").unwrap();
    file.flush().unwrap();
    println!("reported size: {}", file.total_out());
    println!("actual size: {}", file.finish().unwrap().len());
}
Prints:
reported size: 20
actual size: 26
The 6 missing bytes correspond to the ZLIB header; ZlibEncoder doesn't take its size into account and only returns the size of the wrapped deflate stream.
If this is the intended behavior, I think it should be clarified in the total_out() documentation, because this is very unintuitive and unexpected.
Thanks for the report! Is this an issue with all the backends, or just one? If it's just one, it may be a bug in that specific backend.
The bug is reproducible on all 4 backends.
For completeness, I also tested the read and bufread encoders, and their total_in method correctly returns the size including the header.
Hm ok if this reproduces everywhere it may be best to just update the documentation to indicate it doesn't include the 6-byte header.
This means there's an asymmetry between write::ZlibEncoder and read::ZlibEncoder though: the write encoder doesn't include the header, but the read encoder does!
Personally, I would prefer this to be fixed, but I don't really know what this entails for the implementation, and as long as the current behavior is correctly documented I can live with it.
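In the meantime, one possible workaround is to count the bytes at the sink instead of asking the encoder. The following is a minimal sketch using only the standard library; CountingWriter is a hypothetical helper, not part of flate2. The assumption is that it would be passed to flate2::write::ZlibEncoder::new in place of the Vec, and that it reports whatever bytes have reached the sink at the time it is queried.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not part of flate2): wraps any writer and
/// counts the bytes that actually reach it.
struct CountingWriter<W> {
    inner: W,
    written: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        CountingWriter { inner, written: 0 }
    }

    /// Number of bytes accepted by the inner writer so far.
    fn written(&self) -> u64 {
        self.written
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what the inner writer reports as accepted.
        let n = self.inner.write(buf)?;
        self.written += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() {
    let mut sink = CountingWriter::new(Vec::new());
    sink.write_all(b"Hello world!").unwrap();
    println!("bytes written: {}", sink.written());
}
```

Because the count is taken at the sink, it covers everything the encoder emits, whatever the backend reports through total_out().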
That's true yeah, if all the backends behave consistently we can work around that in each implementation. Seems reasonable to fix then!
Note: I just found out that my explanation was slightly wrong, as the ZLIB header is only 2 bytes, not 6.
The 6 extra bytes actually correspond to:
- The 2-byte ZLIB header at the start of the stream
- The 4-byte Adler-32 checksum at the end of the stream (the zlib format uses Adler-32 here, not a CRC)
The zlib wrapper can carry a further 4 bytes (the Adler-32 of the preset dictionary) immediately after the header if FDICT is set, which is currently only available with the zlib backend.
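To make the 2 + 4 byte breakdown concrete, here is a small sketch that builds a zlib stream by hand, following the layout in RFC 1950. It uses only the standard library and does not call flate2. It also makes a simplifying assumption: the payload is wrapped in a single stored (uncompressed) deflate block, so the deflate size differs from what a real encoder produces, while the wrapper overhead is the same. The names adler32 and zlib_stored are illustrative.

```rust
/// Adler-32 checksum as defined in RFC 1950.
fn adler32(data: &[u8]) -> u32 {
    const MOD: u32 = 65_521;
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in data {
        a = (a + u32::from(byte)) % MOD;
        b = (b + a) % MOD;
    }
    (b << 16) | a
}

/// Wraps `data` in a zlib stream containing one stored deflate block.
/// Returns the full stream and the length of the deflate part alone.
fn zlib_stored(data: &[u8]) -> (Vec<u8>, usize) {
    assert!(data.len() <= 0xFFFF, "a stored block holds at most 65535 bytes");
    let len = data.len() as u16;

    // 2-byte zlib header: CMF = 0x78, FLG = 0x01 (FDICT not set).
    let mut out = vec![0x78, 0x01];
    let deflate_start = out.len();

    // Stored block: BFINAL = 1, BTYPE = 00, then LEN and its complement.
    out.push(0x01);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&(!len).to_le_bytes());
    out.extend_from_slice(data);
    let deflate_len = out.len() - deflate_start;

    // 4-byte trailer: Adler-32 of the uncompressed data, big-endian.
    out.extend_from_slice(&adler32(data).to_be_bytes());
    (out, deflate_len)
}

fn main() {
    let (stream, deflate_len) = zlib_stored(b"Hello world!");
    println!("deflate size: {}", deflate_len); // 17
    println!("zlib size: {}", stream.len()); // 23
    println!("wrapper overhead: {}", stream.len() - deflate_len); // 6
}
```

The 6-byte difference matches the gap between the reported and actual sizes in the original snippet.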