density API issue: density_decompress() requires a larger buffer than needed to store the decompressed data

Currently, density_decompress() accesses bytes past the end of the decompressed data, so the caller has to allocate an output buffer of density_decompress_safe_size() bytes. This makes density slower and harder to use than some alternatives in certain applications. For example, when several chunks of a large buffer are decompressed in multiple threads into one large output buffer, this cannot be done in place without additional copying. I understand that accessing memory past the end of the decompressed data avoids extra bounds checks and keeps decompression fast, but you could stop the fast decompression loop some distance before the end and finish with a safe version of the algorithm that never goes beyond the output buffer. That would simplify the API without losing speed.
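
A minimal sketch of that idea, not density's actual code: the record format, the `toy_record` type, the `WIDE` constant and the `decode_exact()` function are all hypothetical and only stand in for a decoder's inner loop. The fast loop always stores a full 8 bytes and runs only while 8 bytes of room remain; the last records are written with exact-length stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WIDE 8 /* width of the unconditional fast-path store (assumed) */

/* Hypothetical toy record: WIDE payload bytes, of which `used` are valid. */
typedef struct {
    uint8_t bytes[WIDE];
    uint8_t used;
} toy_record;

/* Decodes `count` records into out[0..cap) without ever writing at or past
 * out + cap. Returns the number of bytes produced, or (size_t)-1 on error. */
static size_t decode_exact(uint8_t *out, size_t cap,
                           const toy_record *recs, size_t count)
{
    size_t pos = 0, r = 0;

    /* Fast loop: full-width stores that may overshoot the valid bytes,
     * allowed only while WIDE bytes of room remain in the output. */
    while (r < count && cap - pos >= WIDE) {
        size_t n = recs[r].used;
        if (n > WIDE) return (size_t)-1;
        memcpy(out + pos, recs[r].bytes, WIDE);
        pos += n;
        r++;
    }
    assert(pos <= cap);

    /* Safe loop: exact-length stores for the records near the end. */
    while (r < count) {
        size_t n = recs[r].used;
        if (n > WIDE || n > cap - pos) return (size_t)-1;
        memcpy(out + pos, recs[r].bytes, n);
        pos += n;
        r++;
    }
    return pos;
}
```

With this split, the output buffer can be exactly as large as the decoded data, and the per-record length check only runs for the last few records.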

Jul 19 '21 19:07 Luke546

Hey Luke! Yes, that's true, although density already uses a safe mode when processing the last parts of a given input buffer. The function density_decompress_safe_size() is still necessary, though, because it is not possible to know "in advance" the size of the decompressed data given only a compressed input. In any case, I agree that the API could be simpler; if you want to propose a pull request, I'd be glad to review it. Thanks!

Jul 20 '21 17:07 g1mv

it is not possible to know "in advance" the size of decompressed data given a compressed input

What if the header of the compressed file included the decompressed file size?

Dec 11 '21 11:12 Crypto-Spartan

Yes, that would definitely work. The library was initially developed with streams in mind, so that encoding and decoding could take place simultaneously, over a network for example. That was dropped in the end, but it could be studied again. In the meantime, users can also put their own header in front of the density headers to obtain a perfectly sized decode buffer.
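
A minimal sketch of such a user-side header, assuming an 8-byte little-endian size field written in front of the compressed data. The names `SIZE_HEADER_BYTES`, `write_size_header()` and `read_size_header()` are hypothetical and not part of density; the calls to the library itself are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE_HEADER_BYTES 8 /* assumed width of the user-defined header */

/* Stores the original (uncompressed) size as 8 little-endian bytes.
 * The compressed data would be written right after these bytes. */
static void write_size_header(uint8_t *dst, uint64_t original_size)
{
    assert(dst != NULL);
    for (int i = 0; i < SIZE_HEADER_BYTES; i++)
        dst[i] = (uint8_t)(original_size >> (8 * i));
}

/* Reads the size back before decompression.
 * Returns 0 on success, -1 if the input is too short to hold the header. */
static int read_size_header(const uint8_t *src, size_t src_len,
                            uint64_t *original_size)
{
    assert(src != NULL && original_size != NULL);
    if (src_len < SIZE_HEADER_BYTES)
        return -1;
    uint64_t value = 0;
    for (int i = 0; i < SIZE_HEADER_BYTES; i++)
        value |= (uint64_t)src[i] << (8 * i);
    *original_size = value;
    return 0;
}
```

The receiver would call read_size_header() first, size its output buffer from the result, and then pass the bytes after the header to the decompressor. A fixed byte order keeps the header portable between machines of different endianness.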

Dec 12 '21 02:12 g1mv