Sparse files not compressing well

Open · clbr opened this issue 4 years ago · 2 comments

Thanks for making this. I tested it on a set of binary files, and while it beat FSE on 1/8 of them, for the rest FSE did better. Seems the ones where FSE won by a large margin are sparse.

I configured FPC for 1 stream and an adaptive step of 128.

Here are some sample files, if you're interested. Released to public domain. https://files.catbox.moe/pbqhrb.tgz

PS: FPC did get some large wins over FSE too. By trying both, brute force style, and using the better one, the total file size for the set decreased 1.6%.
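The try-both approach can be sketched as below. This is only an illustration: `zlib` and `bz2` from the Python standard library stand in for FPC and FSE (no real bindings for either are used), and the one-byte tag, `CODECS`, `compress_best`, and `decompress_tagged` are hypothetical names made up for the example.

```python
import bz2
import zlib

# Stand-in codecs (assumption): zlib and bz2 take the place of FPC and FSE.
# Each entry maps a one-byte tag to an (encode, decode) pair.
CODECS = {
    0: (zlib.compress, zlib.decompress),
    1: (bz2.compress, bz2.decompress),
}


def compress_best(data: bytes) -> bytes:
    """Encode with every codec and keep the smallest result, tag byte first."""
    candidates = [bytes([tag]) + enc(data) for tag, (enc, _) in CODECS.items()]
    return min(candidates, key=len)


def decompress_tagged(blob: bytes) -> bytes:
    """Read the tag byte and decode the rest with the matching codec."""
    _, dec = CODECS[blob[0]]
    return dec(blob[1:])


if __name__ == "__main__":
    sample = bytes(1000) + b"some non-zero payload" * 10
    packed = compress_best(sample)
    assert decompress_tagged(packed) == sample
    print(len(sample), "->", len(packed), "bytes, codec tag", packed[0])
```

The tag costs one byte per file, so the selection only pays off when the two coders differ by more than that.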

clbr, Jul 13 '20 13:07

Hello, thank you for using FPC.

For this kind of file, it is better to apply some form of RLE first and then entropy code the result. FSE can win here because it uses fractional bits per symbol. On the other hand, it has a bigger header.
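A rough sketch of why this happens, assuming FPC assigns a whole number of bits to each symbol (as the contrast with FSE's fractional bits suggests): a whole-bit code cannot go below 1 bit per symbol once two or more distinct symbols occur, while the entropy of a sparse file is far lower. The synthetic input and the zero-run RLE format below are made up for illustration and are not part of FPC or FSE.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Order-0 Shannon entropy, in bits per byte."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def rle_zero_runs(data: bytes) -> bytes:
    """Encode each run of 1..256 zero bytes as the pair (0x00, run_length - 1)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 256:
                run += 1
            out += bytes((0, run - 1))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def unrle_zero_runs(data: bytes) -> bytes:
    """Inverse of rle_zero_runs."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += bytes(data[i + 1] + 1)  # bytes(n) yields n zero bytes
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    # Synthetic sparse input (assumption): one 0xFF byte in every 64, rest zero.
    sparse = bytes(0 if i % 64 else 0xFF for i in range(64 * 1024))
    h = entropy_bits_per_symbol(sparse)
    print(f"entropy: {h:.3f} bits/byte, whole-bit floor: 1 bit/byte")
    packed = rle_zero_runs(sparse)
    assert unrle_zero_runs(packed) == sparse
    print(len(sparse), "->", len(packed), "bytes after RLE, before entropy coding")
```

On this input the entropy is well under 1 bit per byte, so a whole-bit coder leaves most of the possible saving on the table unless the zero runs are collapsed first.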

kagiannis, Jul 13 '20 13:07

LZ4+FPC still loses to FSE on these files. Not sure a homegrown RLE would do any better.

clbr, Jul 14 '20 05:07