flate2-rs
Add example & references in docs to steer people directly to MultiGzDecoder
I think it would help users to have documentation that clearly points people to MultiGzDecoder for decompressing gzipped data. I see two specific places for this:
- On the front page, in addition to the encoding example, give a decoding example.
- In the GzDecoder docs, make a clear warning that GzDecoder only decodes a single stream, and if users want to reliably decode arbitrary gzipped files, they should look at MultiGzDecoder.
'I have some gzipped data, please decode' seems like a very common use case for flate2 (it was certainly my use case). Right now, it isn't obvious at a quick intro level how to do that.
Sounds plausible to me!
I've been burned by using GzDecoder, not knowing there were multiple members in the file. Even if you know an input is a single member, is there a compelling reason to use GzDecoder over MultiGzDecoder? (Presumably the latter will yield the same results, but is there a perf hit for it?)
There are some obscure cases, but more around when you know that a file is multiple streams. The Alpine .apk format uses multiple .gz streams in a design that is simultaneously clever and mildly horrifying. They tar & compress the package content; then tar & compress metadata, including a checksum of the compressed content stream, as a “tar fragment” (sequence of tar entries without the end-of-tarball indicator) and prepend it to the content; and finally, they sign all of that, pack the signature into a tar entry, gzip it, and prepend it to the whole thing. So you can tar xvzf the entire file and get the content, metadata, and signature, but if you want to work with the file directly and verify its signature, you need to know when the first gzip stream stops.
I don't think doing surgery on Alpine packages is the common use case for flate2.