image-gif icon indicating copy to clipboard operation
image-gif copied to clipboard

Decompressions requires an unecessary intermediate buffer

Open HeroicKatora opened this issue 4 years ago • 0 comments

A write-up of one potential performance problem and improvements.

The decompression of blocks happens one sub-block at a time, or less if the BufRead buffer does not cover the whole block. The decoder then uses a temporary buffer StreamingDecoder::decode_buffer and returns a view into it after the decoding was successful. In the next step the colors from this buffer are expanded. In the interlaced case we loop over the output several times and copy data from previous interlace passes to yet another buffer in Decoder since we can not keep the borrow from StreamingDecoder long enough.

https://github.com/image-rs/image-gif/blob/master/src/reader/decoder.rs#L191-L192

https://github.com/image-rs/image-gif/blob/master/src/reader/mod.rs#L303-L366

https://github.com/image-rs/image-gif/blob/master/src/reader/mod.rs#L163

Note that even in the non-interlaced case this is wasteful. We only ever expand colors. Thus we could decompress directly into the output and then expand them from that slice, saving a memory allocation. For the interlaced case this doesn't work as the target pixels span the whole buffer. Still, this does not appear optimal. The strategy decompresses one sub-block at a time and immediately expands it. Those are at most 1 << 8 bytes to the decompression. While the lzw stage could turn them into roughly 1 << 20 bytes of raw data, that is quite uncommon and can only happen after 1 << 14 bytes of input. In most cases the compression ratio will be significantly worse, for example ~20-30%. Then this only gives us 1 << 10 bytes of output to chow through at once. This is far from efficient for even memcpy in particular since we do not ensure that the output is particularly aligned.

HeroicKatora avatar Oct 18 '20 14:10 HeroicKatora