postcss-assets
Improve SVG encoding in data URIs
First of all, thank you for PostCSS Assets!
I think the encoding of SVGs in data URIs can be improved as described in the blog post "Optimizing SVGs in data URIs". The method "Optimized URL-encoded" yields a smaller result than "Fully URL-encoded".
postcss-svgo uses the optimized URL-encoding. (But in my specific use case the SVG files have already been processed by SVGO when I pass them to PostCSS Assets. So ideally I do not want to run them through SVGO a second time.)
@steffenweber thanks, this is a really nice one. I've started to implement the advanced optimization based on that article and the postcss-svgo plugin code: https://github.com/assetsjs/assets/tree/feature/svg-data-optimization
However, for now I would like to omit the quote optimization, because it requires some tricky logic: the quotes around attribute values should be forced to single quotes, while all other quotes should be kept as they were (so we don't break possible textual content inside the SVG). postcss-svgo is fairly naive in this case: it just converts every quote, so such textual content could be corrupted.
I think I'll release a new version without the quote optimization and add it later. I would also appreciate any help in implementing it.
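One way the attribute-only quote handling described above could be approached is sketched here. The name singleQuoteAttributes is made up for illustration and is not part of postcss-assets or postcss-svgo. The tag regex is a deliberate simplification: it assumes attribute values contain no ">" and it ignores comments and CDATA sections.

```javascript
// Hypothetical helper (assumption, not project code).
// Rewrites double-quoted attribute values to single-quoted ones, but only
// inside start tags, so text content between tags is left untouched.
function singleQuoteAttributes(svg) {
  // Match each start or self-closing tag, e.g. <svg ...> or <path .../>.
  // Closing tags (</text>) are skipped because they carry no attributes.
  return svg.replace(/<[A-Za-z][^<>]*>/g, function (tag) {
    // Swap ="value" for ='value' only when the value itself contains
    // no single quote; otherwise the attribute is left as it was.
    return tag.replace(/=\s*"([^"']*)"/g, "='$1'");
  });
}

const input = '<svg xmlns="http://www.w3.org/2000/svg"><text x="0">say "hi"</text></svg>';
console.log(singleQuoteAttributes(input));
```

The quotes around hi sit in text content rather than inside a tag, so they stay double quotes, while the attribute quotes become single quotes.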
Thank you for starting the implementation! There is a small change required to make it work because currently the generated data URIs have a syntax error (Chrome reports net::ERR_INVALID_URL in the Developer Tools console).
Bad: data:image/svg+xml;…
Good: data:image/svg+xml,…
Alternative: data:image/svg+xml;charset=utf-8,…
The alternative is easier to implement: just prepend 'charset=utf-8,' to the result of optimizedEncodeUri in encodeBuffer.js. The more complete fix would be to change this module's code such that not all data URIs require a charset/encoding (I have not tried to implement that).
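A minimal sketch of the "Alternative" form, assuming the encoder is passed in as a function. buildSvgDataUri is a hypothetical name; the actual code in encodeBuffer.js may be structured quite differently.

```javascript
// Hypothetical helper (assumption, not project code).
// Builds a data URI in the "Alternative" form: media type, charset
// parameter, then the comma that separates the header from the payload.
function buildSvgDataUri(svg, encode) {
  return 'data:image/svg+xml;charset=utf-8,' + encode(svg);
}

// encodeURIComponent stands in for the project's optimizedEncodeUri here.
console.log(buildSvgDataUri('<svg/>', encodeURIComponent));
// → data:image/svg+xml;charset=utf-8,%3Csvg%2F%3E
```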
Quote optimization: I thought that quotes appearing unencoded as " instead of as &quot; in textual contents were a syntax error in SVG. But the W3 validator has no problem with them. :confused:
General approach: For safety reasons, I think it would be better to apply encodeURIComponent and then undo those substitutions that are known to be safe (whitelist instead of blacklist approach). Like this:
module.exports = function (string) {
  return encodeURIComponent(string.trim())
    .replace(/%20/g, ' ')
    .replace(/%2F/g, '/')
    .replace(/%3A/g, ':')
    .replace(/%3D/g, '=');
};
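To see what the whitelist approach produces, here is a small usage sketch. The function body is the one above; the name whitelistEncode and the sample SVG are made up for illustration.

```javascript
// Same whitelist approach as above, given a name so it can be called here.
// `whitelistEncode` is a hypothetical name, not an identifier from the project.
function whitelistEncode(string) {
  return encodeURIComponent(string.trim())
    .replace(/%20/g, ' ')
    .replace(/%2F/g, '/')
    .replace(/%3A/g, ':')
    .replace(/%3D/g, '=');
}

const svg = '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1 1"/>';
const full = encodeURIComponent(svg);
const optimized = whitelistEncode(svg);

console.log(optimized);
// → %3Csvg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 1 1%22/%3E

// The whitelisted form is shorter than the fully encoded one.
console.log(full.length, optimized.length);
```

Anything not explicitly whitelisted (here <, > and ") stays percent-encoded, which is the point of starting from encodeURIComponent.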
Hmm, maybe the proposed optimization is not such a good idea after all.
"Permitted characters within a data URI are the ASCII characters for the lowercase and uppercase letters of the modern English alphabet, and the Arabic numerals. Octets represented by any other character must be percent-encoded, as in %26 for an ampersand (&)." (https://en.wikipedia.org/wiki/Data_URI_scheme)
See also: https://perishablepress.com/stop-using-unsafe-characters-in-urls/
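For comparison, the strictest reading of the rule quoted above (only ASCII letters and digits left unencoded) could look roughly like this. strictEncode is a hypothetical name, and this is an illustration of that rule rather than a recommendation or anything the project implements.

```javascript
// Hypothetical helper (assumption, not project code).
// Percent-encodes every UTF-8 octet except ASCII letters and digits.
function strictEncode(string) {
  const bytes = new TextEncoder().encode(string); // UTF-8 octets
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    out += /[A-Za-z0-9]/.test(ch)
      ? ch
      : '%' + byte.toString(16).toUpperCase().padStart(2, '0');
  }
  return out;
}

console.log(strictEncode("x='1'")); // → x%3D%271%27
console.log(strictEncode('é'));     // → %C3%A9
```

Note that encodeURIComponent is slightly more lenient than this: it also leaves - _ . ! ~ * ' ( ) unencoded.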
What about an option for a custom function to encode SVG, like here?
For example: