binjs-ref

Test different sortings for `[STRINGS]`

Open Yoric opened this issue 6 years ago • 2 comments

We currently sort `[STRINGS]` from most used to least used. This could interfere with compression. Let's see whether we get better results by storing them:

  • by lexicographical order;
  • by lexicographical order right-to-left.
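
For illustration only, here is a minimal Python sketch of the three orderings under discussion. It is not binjs-ref code. The function names and the sample identifiers are hypothetical, and "right-to-left" is assumed to mean comparing strings starting from their last character, so that shared suffixes end up adjacent.

```python
from collections import Counter


def by_frequency(occurrences):
    """Unique strings, most used first (the current ordering).

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(occurrences)
    return sorted(counts, key=lambda s: (-counts[s], s))


def lexicographic(occurrences):
    """Unique strings in plain lexicographical order (shared prefixes adjacent)."""
    return sorted(set(occurrences))


def lexicographic_rtl(occurrences):
    """Unique strings compared from the last character backwards.

    Assumption: this is what "right-to-left" means here; it groups
    strings that share a suffix.
    """
    return sorted(set(occurrences), key=lambda s: s[::-1])


# Hypothetical sample: every use of a string in some source file.
occurrences = ["length", "push", "length", "onload", "unload", "length", "push"]

print(by_frequency(occurrences))       # ['length', 'push', 'onload', 'unload']
print(lexicographic(occurrences))      # ['length', 'onload', 'push', 'unload']
print(lexicographic_rtl(occurrences))  # ['onload', 'unload', 'push', 'length']
```

Note that only the right-to-left order places `onload` next to `unload`, which differ only in their first character.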

Yoric avatar Apr 18 '18 17:04 Yoric

Brotli generally does well when it can make a copy from a short distance away. Sorting is a decent heuristic because it puts prefixes together. You might wring a bit more benefit out of sorting by bucketing strings into "short" and "long" and then sorting those.
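
A rough sketch of the bucketing idea, for anyone who wants to experiment. Everything here is an assumption rather than binjs-ref behavior: the `threshold` value, the NUL-separated serialization, and the function names are all hypothetical. `zlib` from the Python standard library stands in for Brotli, so the sizes it reports are only indicative.

```python
import zlib


def bucketed_order(strings, threshold=8):
    """Split unique strings into "short" and "long" buckets, sort each bucket,
    and emit the short bucket first.

    `threshold` is a hypothetical cut-off: strings with at most that many
    characters count as short.
    """
    unique = set(strings)
    short = sorted(s for s in unique if len(s) <= threshold)
    long_ = sorted(s for s in unique if len(s) > threshold)
    return short + long_


def compressed_size(ordered_strings):
    """Size in bytes of the strings joined with NUL separators and deflated.

    Assumption: zlib replaces Brotli here, and the NUL-joined layout is a
    stand-in for the real `[STRINGS]` encoding.
    """
    payload = "\0".join(ordered_strings).encode("utf-8")
    return len(zlib.compress(payload, 9))


# Hypothetical sample table.
strings = ["prototype", "a", "constructor", "b", "addEventListener", "id"]

for name, order in [
    ("sorted", sorted(set(strings))),
    ("bucketed", bucketed_order(strings, threshold=4)),
]:
    print(name, order, compressed_size(order))
```

On a table this small the sizes say nothing useful. The comparison only becomes meaningful on real string tables, and ideally with Brotli itself.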

dominiccooney avatar May 11 '18 03:05 dominiccooney

Experiments indicate that we can gain ~2% on samples by changing the sort order, but an order that helps some samples hurts others. To be continued...

Yoric avatar May 11 '18 05:05 Yoric