Weissman Score for Short String Compression

Open BradKML opened this issue 3 years ago • 0 comments

There is a benchmark for compression, the Weissman score: https://www.wikiwand.com/en/Weissman_score. It can be applied to short-string compression too, so that shoco can be compared with:

  • Smaz https://github.com/antirez/smaz and https://github.com/CordySmith/PySmaz
  • Unishox https://github.com/siara-cc/Unishox and https://github.com/tweedge/unishox2-py3
  • Skrot https://github.com/jstepien/skrot

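The equation for a single algorithm, in simplified form: Compression Ratio / log(Time Required). The full Weissman score also divides both terms by those of a reference compressor and multiplies by a scaling constant.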
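A minimal sketch of that equation in Python, in case it helps the discussion. The function name `weissman_score` and its parameters are hypothetical, not part of shoco or any of the libraries above. With the default reference values (`ref_ratio=1.0`, `ref_time=e`) it reduces to the simplified Compression Ratio / ln(Time Required). It assumes times are given in a unit where every value is greater than 1 (for example nanoseconds), because the logarithm is zero at 1 and negative below it.

```python
import math


def weissman_score(ratio, time_taken, ref_ratio=1.0, ref_time=math.e, alpha=1.0):
    """W = alpha * (ratio / ref_ratio) * (log(ref_time) / log(time_taken)).

    ratio      : uncompressed size / compressed size of the algorithm under test
    time_taken : time to compress, in any unit where the value is > 1
    ref_ratio, ref_time : the same two measurements for a reference compressor
    alpha      : scaling constant

    All names here are illustrative assumptions.
    """
    if time_taken <= 1.0 or ref_time <= 1.0:
        # log(t) <= 0 would make the score undefined or negative.
        raise ValueError("times must be > 1; use a finer unit such as nanoseconds")
    return alpha * (ratio / ref_ratio) * (math.log(ref_time) / math.log(time_taken))
```

For example, `weissman_score(2.0, 100.0, ref_ratio=1.0, ref_time=10.0)` compares a compressor that halves the input in 100 time units against a reference that does not compress at all and takes 10 time units.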

Issue: How does one measure speed when the string is short?
Solution: Aggregate the results from compressing many short strings.

Issue: How should the data be aggregated?
A: Score each string individually, then average the scores.
B: Concatenate all strings and add up their time usage, then calculate a single score.
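The two options can give different numbers, because the logarithm is not linear. Below is a rough Python sketch of both, using the simplified score. The helper names and the sample tuple layout `(original_size, compressed_size, time_ns)` are assumptions for illustration, and option B is interpreted here as summing per-string sizes and times rather than compressing one long concatenated string.

```python
import math


def simple_score(ratio, time_ns):
    # Simplified score from this issue: ratio / log(time), time in nanoseconds.
    return ratio / math.log(time_ns)


def average_of_scores(samples):
    """Option A: score each string on its own, then take the mean."""
    scores = [simple_score(orig / comp, t) for orig, comp, t in samples]
    return sum(scores) / len(scores)


def score_of_totals(samples):
    """Option B: pool sizes and times first, then compute one score."""
    total_orig = sum(orig for orig, _, _ in samples)
    total_comp = sum(comp for _, comp, _ in samples)
    total_time = sum(t for _, _, t in samples)
    return simple_score(total_orig / total_comp, total_time)


# Made-up measurements, only to show the two aggregations side by side.
samples = [(10, 5, 100.0), (20, 10, 100.0)]
a = average_of_scores(samples)
b = score_of_totals(samples)
```

In this sketch option B shrinks as more strings are added, since the total time grows while the pooled ratio stays roughly the same, so scores from B are only comparable across algorithms when the same set of strings is used.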

Issue: Using Python to time each algorithm.
Solution: Maybe use C instead? Or is there a way to do this in Python?
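One possible way to do it in Python is to time a whole pass over many strings with `time.perf_counter_ns`, repeat the pass, and keep the fastest run. The sketch below is only an illustration: `time_compress` is a hypothetical helper, and `zlib.compress` from the standard library stands in for the short-string compressors, whose Python bindings are not assumed here. Note that timing a Python wrapper or port also measures the wrapper's overhead, not just the underlying algorithm.

```python
import time
import zlib


def time_compress(compress, strings, repeats=1000):
    """Return (total_input_bytes, total_output_bytes, best_pass_ns).

    compress : any callable taking bytes and returning bytes
    strings  : the short strings to benchmark
    repeats  : how many full passes to time; the fastest pass is kept
    """
    encoded = [s.encode("utf-8") for s in strings]
    total_in = sum(len(b) for b in encoded)
    total_out = sum(len(compress(b)) for b in encoded)

    best = None
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for b in encoded:
            compress(b)
        elapsed = time.perf_counter_ns() - start
        if best is None or elapsed < best:
            best = elapsed
    return total_in, total_out, best


# zlib is only a placeholder compressor for this example.
total_in, total_out, best_ns = time_compress(
    zlib.compress, ["hello world", "foo"], repeats=5
)
```

Taking the minimum over repeats is a common way to reduce noise from other processes; the standard `timeit` module offers similar repetition helpers.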

BradKML, Aug 27 '21 04:08