Keka
Why did my old Keka 1.0.4 create p7zip solid archives of mostly binary data that were 50% smaller?
I updated my OS X from Mavericks to Catalina a few days ago and thought it would be nice to update my toolset as well. Using Keka 1.2.12 with 7z (solid, exclude resource forks, best compression) results in archives twice the size of those from my old Keka 1.0.4 on Mavericks.
I like the task queueing of the recent Keka, so I replaced (keka7z/7z.so) with my old Keka binaries from 1.0.4 in order to get the results I am used to while keeping the queue functionality.
Please give me some insight into what could cause this behaviour...
Thank you!
...You should test with copies of almost identical binaries; there must be something wrong with the p7zip SOLID mode
arguments: -t7z -mx9 -ms=on
@andyrobik can you share a file that produces this difference in size when compressed? I can't reproduce your issue.
I've tested compressing Keka.app (which contains multiple binaries) with almost negligible (0,004%) size difference.
Just occurred to me that maybe you were using ZIP instead of 7Z format in the newer version?
I'm pretty sure the problem isn't Keka but p7zip itself. I will share some examples and binaries to test with if you're still interested, as this is really intriguing. Can I PM you here in order to send the test files?
There's no PM here but you can get in touch via mail at [email protected] :)
@andyrobik thanks for the files. I'll be testing them and see what we can do about this one.
I thank you
Already contacted via mail, but this issue is caused by the sorting system used by p7zip. You can see in 7-Zip's FAQ:
You can get big difference in compression ratio for different sorting methods, if dictionary size is smaller than total size of files. If there are similar files in different folders, the sorting "by type" can provide better compression ratio in some cases.
Also, in this case the use of the BCJ2 filter resulted in a worse ratio, and so did using LZMA2 instead of LZMA, although only slightly and with a speed penalty.
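The FAQ's point about ordering can be reproduced in miniature. What follows is a toy sketch, not p7zip itself: it uses Python's standard lzma module with a deliberately small dictionary (96 KiB, a value picked only for this demo) and two pairs of identical 64 KiB pseudo-random "files". The helper names make_blob and solid_size are made up for illustration. When identical files sit next to each other in the solid stream, the second copy is still within reach of the dictionary; when they are interleaved, it is not.

```python
import lzma
import random

BLOB_SIZE = 64 * 1024   # size of each toy "file"
DICT_SIZE = 96 * 1024   # smaller than the total input, larger than one file


def make_blob(seed, size=BLOB_SIZE):
    """Return deterministic pseudo-random (practically incompressible) bytes."""
    rng = random.Random(seed)
    return rng.getrandbits(size * 8).to_bytes(size, "little")


def solid_size(files):
    """Compress all files as one continuous stream and return the output size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 9, "dict_size": DICT_SIZE}]
    data = b"".join(files)
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


a, b = make_blob(1), make_blob(2)

# Same four files, two different orders inside the solid stream.
interleaved = solid_size([a, b, a, b])  # each duplicate is 128 KiB behind its copy
grouped = solid_size([a, a, b, b])      # each duplicate is 64 KiB behind its copy

print(f"interleaved: {interleaved} bytes")
print(f"grouped:     {grouped} bytes")
```

With this setup the grouped order should come out at roughly half the size of the interleaved one, which is the same kind of gap reported in this thread. Real archives are less extreme, since real files are only partly similar and the dictionary is much larger.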
Will need to think how to implement this, but meanwhile here are two builds:
- Keka-QS+LZMA: Using sorting by type and no filter when solid is selected
- Keka-QS+LZMA: Using sorting by type, no filter and LZMA when solid is selected
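The thread does not say which switches these two builds actually pass to p7zip. As a rough sketch for anyone who wants to try the same comparison from the command line, the snippet below assumes the descriptions map onto the 7-Zip method switches -mqs=on (sort files by type), -mf=off (disable filters such as BCJ2) and -m0=lzma (LZMA instead of LZMA2), added to the arguments quoted earlier in the thread. The VARIANTS mapping and the build_command helper are hypothetical, and whether a given p7zip version accepts every switch should be checked against its own documentation.

```python
import shlex

# Arguments quoted earlier in the thread: 7z format, maximum level, solid mode.
BASE_ARGS = ["-t7z", "-mx9", "-ms=on"]

# ASSUMPTION: guessed mapping from the build descriptions to 7-Zip switches.
VARIANTS = {
    "default": [],
    "Keka-QS": ["-mqs=on", "-mf=off"],
    "Keka-QS+LZMA": ["-mqs=on", "-mf=off", "-m0=lzma"],
}


def build_command(variant, archive, sources, exe="7z"):
    """Return the argument list for one variant; nothing is executed here."""
    return [exe, "a", *BASE_ARGS, *VARIANTS[variant], archive, *sources]


# Print one command line per variant so they can be run and compared by hand.
for name in VARIANTS:
    cmd = build_command(name, f"test-{name}.7z", ["Keka.app"])
    print(shlex.join(cmd))
```

The commands are only printed, so the resulting archive sizes and timings can be compared manually on the same input.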
"Keka-QS+LZMA" is my fave: it takes more than double the time (150%) to compress but uses less CPU time and gives better compression results ("Keka-QS" brawled at 75% with howling fans, and its archive was 20% bigger). It would be nice if there was a way to get the best compression options automagically. You just say, e.g., "I want the archive to be as small as possible" vs. "I don't care if the result is like a zip ;) as long as it is fast", with a not too comprehensive set of sliders to tune it. A look into the p7zip docs gives me a headache. With p7zip's default behaviour, I personally don't see a big advantage over zip/rar right now for binary data compression, so I will stick with my personal "Keka-QS+LZMA" version. Thank you!
BTW you made a typo in the above post, naming both versions the same.