css-inline
Don't convert HTML Entities to Unicode.
I've been using an old Perl css-inliner, which is quite good, but very, very slow (yes, I mean very slow even for pure Perl). I was happy to see this Rust-powered solution, and it is indeed over 100x faster, but it has a "feature" that sort of breaks it for me, and I don't understand the reason for it. So, given this input:
<html>
<head><meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"></head>
<body>Here&rsquo;s Johnny</body>
</html>
I get:
<html><head><meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type"></head>
<body>Hereâs Johnny
</body></html>
Which comes up as Hereââ¬â¢s Johnny on the browser.
Now, one could ask why use the right single quote at all (although it is very common for typographic reasons in place of an apostrophe), but that was just an example. Even things like £ get translated to 0xC2 0xA3, which looks quite bad unless your charset is UTF-8. That is not great if you are trying to inline various documents that you did not create and that were not written with that limitation in mind.
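To illustrate what I think is happening, here is a minimal sketch using only the Python standard library (it does not call css-inline at all). It assumes the output is written as UTF-8 and then read back under the declared iso-8859-1 charset, which browsers treat as windows-1252:

```python
# The pound sign becomes two bytes when encoded as UTF-8.
pound_utf8 = "\u00a3".encode("utf-8")
print(pound_utf8)  # b'\xc2\xa3'

# Read back as iso-8859-1, each byte turns into its own character.
as_latin1 = pound_utf8.decode("iso-8859-1")
print(as_latin1)  # Â£

# The right single quote becomes three bytes in UTF-8.
quote_utf8 = "\u2019".encode("utf-8")
print(quote_utf8)  # b'\xe2\x80\x99'

# Browsers decode "iso-8859-1" pages as windows-1252, so the three
# bytes show up as three unrelated characters.
as_cp1252 = quote_utf8.decode("cp1252")
print(as_cp1252)  # â€™
```

So the text itself is not corrupted; the bytes just no longer match the charset the document declares.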
I looked through the Python wrapper code, and I see it is not doing anything special apart from calling the Rust package. Looking at the Rust docs, I don't see any control over (or mention of) this behaviour, so the issue might technically be with the Servo components and not with css-inline, but I thought I'd ask in case I am missing something.
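As a possible stopgap on the Python side, the inlined string could be post-processed so that every non-ASCII character becomes a numeric character reference, which renders the same under any charset. This is only a sketch: `to_ascii_entities` is a hypothetical helper name, and `inlined` is a hand-written stand-in for whatever the inliner returns. It also produces numeric references rather than restoring the original named entities.

```python
def to_ascii_entities(html: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    # The "xmlcharrefreplace" error handler emits &#NNNN; for anything
    # that cannot be represented in the target encoding (ASCII here).
    return html.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Stand-in for the inliner's output (assumption: entities already decoded).
inlined = "<body>Here\u2019s Johnny, \u00a35</body>"

safe = to_ascii_entities(inlined)
print(safe)  # <body>Here&#8217;s Johnny, &#163;5</body>
```

The result is pure ASCII, so it displays correctly whether the page declares iso-8859-1, windows-1252, or UTF-8.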