shoco
Weissman Score for Short String Compression
The Weissman score (https://www.wikiwand.com/en/Weissman_score) is a benchmark for compression. It can be applied to short strings too, comparing shoco against:
- Smaz https://github.com/antirez/smaz and https://github.com/CordySmith/PySmaz
- Unishox https://github.com/siara-cc/Unishox and https://github.com/tweedge/unishox2-py3
- Skrot https://github.com/jstepien/skrot
The equation for a single algorithm: compression ratio / log(time required). The full Weissman score additionally normalizes both terms against a reference compressor.
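A minimal Python sketch of the score, assuming the usual definition W = alpha * (r / r_ref) * (log(T_ref) / log(T)). The function name and parameters are hypothetical, and the reference compressor is whatever baseline gets chosen. Note that the result depends on the time unit: the unit should be small enough that every time is greater than 1 (nanoseconds, for example), otherwise the logarithm is zero or negative.

```python
import math

def weissman_score(ratio, time_taken, ref_ratio, ref_time, alpha=1.0):
    """Weissman score of one compressor relative to a reference compressor.

    ratio, ref_ratio: compression ratios (original size / compressed size).
    time_taken, ref_time: compression times in the same unit, both > 1.
    alpha: scaling constant (assumed to be 1.0 here).
    """
    if time_taken <= 1 or ref_time <= 1:
        raise ValueError("times must be > 1; use a smaller unit such as ns")
    return alpha * (ratio / ref_ratio) * (math.log(ref_time) / math.log(time_taken))
```

A compressor that matches the reference in both ratio and time scores exactly `alpha`.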
Issue: How can the speed be measured when each string is short?
Solution: Aggregate the results from compressing many short strings.
Issue: How can the data be aggregated?
- A: Score each string individually, then average the scores.
- B: Pool all strings and their time usage, then calculate a single score.
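The two aggregation options can give different numbers, which a small Python sketch makes visible. It uses the simplified single-algorithm score (ratio / log(time)) and assumes each sample is a hypothetical `(original_size, compressed_size, time_ns)` tuple; option B is interpreted here as summing sizes and times across all strings.

```python
import math

def simple_score(original_size, compressed_size, time_ns):
    """Simplified score: compression ratio / log(time). Requires time_ns > 1."""
    ratio = original_size / compressed_size
    return ratio / math.log(time_ns)

def score_mean_of_strings(samples):
    """Option A: score every string on its own, then average the scores."""
    scores = [simple_score(o, c, t) for o, c, t in samples]
    return sum(scores) / len(scores)

def score_pooled(samples):
    """Option B: sum sizes and times over all strings, then score once."""
    total_original = sum(o for o, _, _ in samples)
    total_compressed = sum(c for _, c, _ in samples)
    total_time = sum(t for _, _, t in samples)
    return simple_score(total_original, total_compressed, total_time)

# Made-up sample values, for illustration only.
samples = [(10, 5, 100), (20, 5, 100)]
mean_score = score_mean_of_strings(samples)
pooled_score = score_pooled(samples)
```

Option A weights every string equally regardless of length, while option B lets longer strings dominate and makes the log term grow with the number of strings, so scores from B are only comparable across runs that use the same corpus.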
Issue: Timing each algorithm from Python.
Solution: Maybe use C instead? Or is there a way to do this in Python?
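One possible way to do the timing in Python is to loop over the whole set of strings many times and take the best of several repeats, using the standard library's `time.perf_counter_ns`. The sketch below uses `zlib.compress` purely as a stand-in compressor; the helper name, the sample strings, and the repeat counts are assumptions. The measurement includes interpreter and call overhead, so it is best suited to comparing compressors that are all called the same way.

```python
import time
import zlib

def time_compress_ns(compress, strings, repeats=5, loops=1000):
    """Return the best observed average time per compress() call, in ns."""
    best = None
    for _repeat in range(repeats):
        start = time.perf_counter_ns()
        for _loop in range(loops):
            for s in strings:
                compress(s)
        elapsed = time.perf_counter_ns() - start
        # Keep the fastest repeat: it is the least disturbed by other load.
        if best is None or elapsed < best:
            best = elapsed
    return best / (loops * len(strings))

# Stand-in inputs and compressor, for illustration only.
strings = [b"hello world", b"short string compression", b"shoco"]
per_call_ns = time_compress_ns(zlib.compress, strings)
```

Any other compressor can be passed as `compress` as long as it is a callable taking one string; the standard `timeit` module offers a similar repeat-and-take-the-minimum workflow.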