vite-plugin-optimize-css-modules
A different dictionary can trivially improve gzip compression ratio
Currently, the default dictionary is: _-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Based on a character frequency analysis of the code in a few of my projects it should be re-ordered to this: etionraldfps0gx-1chbum4v6w25k9y873zjHCONADLYqBEFGIJKMPQRSTUVWXZ_
For me this measurably improved the gzip compression ratio at no cost. Perhaps a deeper analysis could find an optimal default dictionary, but this is at least a step in the right direction.
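The exact analysis behind the proposed ordering is not shown in this thread. As a minimal sketch, assuming it amounts to counting how often each dictionary character occurs in the CSS source and sorting by descending count, it could look like the following. `reorderByFrequency` is a hypothetical helper, not part of the plugin's API.

```typescript
// Default dictionary as quoted in this thread.
const DEFAULT_DICTIONARY =
  "_-abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Hypothetical helper (an assumption, not the plugin's API): reorder the
// dictionary so that characters occurring most often in `css` come first.
function reorderByFrequency(
  css: string,
  dictionary: string = DEFAULT_DICTIONARY
): string {
  // Start every dictionary character at zero so unseen characters are kept.
  const counts: Record<string, number> = {};
  for (let i = 0; i < dictionary.length; i++) {
    counts[dictionary.charAt(i)] = 0;
  }

  // Count only characters that belong to the dictionary.
  for (let i = 0; i < css.length; i++) {
    const ch = css.charAt(i);
    if (Object.prototype.hasOwnProperty.call(counts, ch)) {
      counts[ch] += 1;
    }
  }

  // Sort by descending count. Array.prototype.sort is stable in modern
  // engines, so characters with equal counts keep their original order.
  return dictionary
    .split("")
    .sort((a, b) => counts[b] - counts[a])
    .join("");
}

// Usage: pass the project's concatenated CSS source.
const reordered = reorderByFrequency(".btn { color: red; border: none; }");
```

The result always contains the same characters as the input dictionary, only in a different order, so it can be swapped in without changing how many names are available.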
Wow, this is super interesting! I definitely haven't thought about that but will for sure have a look at it!
I added benchmarks and ran them against your dictionary. Of course, this is highly skewed because I'm only testing against two libraries at the moment. I could only see a small improvement in size but a major slowdown in bundle time:
Current dictionary
| Input | Build Time | Gzip Size | Brotli Size |
|---|---|---|---|
| bootstrap-5.0.2.module.css | 525ms (-94.06% / -8311ms) | 21.3 kB (-26.53% / -7.69 kB) | 21.3 kB (-27.54% / -6 kB) |
| materialize-1.0.0.module.css | 572ms (-92.59% / -7156ms) | 20.1 kB (-19.70% / -4.93 kB) | 20.1 kB (-21.33% / -4.3 kB) |
Your dictionary
| Input | Build Time | Gzip Size | Brotli Size |
|---|---|---|---|
| bootstrap-5.0.2.module.css | 1106ms (-88.00% / -8114ms) | 21.3 kB (-26.66% / -7.73 kB) | 21.3 kB (-28.22% / -6.15 kB) |
| materialize-1.0.0.module.css | 1112ms (-87.46% / -7751ms) | 20.1 kB (-19.80% / -4.95 kB) | 20.1 kB (-20.64% / -4.16 kB) |
Do you have any publicly accessible libraries or CSS code that I can test this against, or references on why the order of your dictionary should have a large impact? I can imagine you ordered it by how often each character is used, but I'd like to include some references before I make that change :D
I ran a character frequency analysis on my particular project’s CSS. I suspect that if you do the same for these specific projects you’ll observe a real improvement. When I get a chance, I can look into a better general dictionary.
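One plausible reason the order matters can be sketched under a strong assumption: that generated class names are handed out shortest first by counting through the dictionary. This is an illustration, not the plugin's confirmed algorithm, and `nameForIndex` is a hypothetical function. Under that scheme the characters at the front of the dictionary appear in far more names than those at the back, so putting characters that are already common in the CSS up front keeps the overall character distribution skewed, which gzip's Huffman coding can exploit. The sketch ignores CSS identifier rules such as names not starting with a digit.

```typescript
// Hypothetical name generator (an assumption for illustration only):
// bijective base-N counting over the dictionary, shortest names first.
function nameForIndex(index: number, dictionary: string): string {
  const base = dictionary.length;
  let n = index + 1; // bijective numeration is 1-based
  let name = "";
  while (n > 0) {
    const digit = (n - 1) % base;
    name = dictionary.charAt(digit) + name;
    n = Math.floor((n - 1) / base);
  }
  return name;
}

// With a three-character dictionary the first names are:
// "a", "b", "c", "aa", "ab"
const firstNames: string[] = [];
for (let i = 0; i < 5; i++) {
  firstNames.push(nameForIndex(i, "abc"));
}
```

With the 64-character dictionary from this thread, the same scheme would give 64 one-character names before any two-character name is used.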
I’m also not really sure why build time would be affected. That’s strange to me.
I just ran it a few more times. It seems my Mac was busy with something else. You're right, the build time is the same.
| Input | Build Time | Gzip Size | Brotli Size |
|---|---|---|---|
| bootstrap-5.0.2.module.css | 550ms (-93.93% / -8515ms) | 21.3 kB (-26.66% / -7.73 kB) | 21.3 kB (-28.22% / -6.15 kB) |
| materialize-1.0.0.module.css | 569ms (-93.29% / -7917ms) | 20.1 kB (-19.80% / -4.95 kB) | 20.1 kB (-20.64% / -4.16 kB) |
I also thought about character frequency, but it would need to be computed from the CSS that is actually being compiled, which should be possible within the scope of the plugin. I'm not sure it's worth it, though, or whether a general, improved dictionary based on the most frequently used characters in CSS attributes is more useful.
This is the same question I’m asking. If it’s done in the plugin, we’d have to iterate over the CSS twice (or perhaps cache some info, like character frequency, from run to run). This would definitely bring the best results, but I think a better general-purpose dictionary may yield results nearly as good with much less effort. I think at minimum we should take that approach first.
Yeah, I'd also favor a more general dictionary. If you want, you can have a go at this since you already investigated it for your own project, and open a PR in case you find anything :)