emogrifier
emogrifier copied to clipboard
Refactor the CSS parsing
We should move the CSS parsing either to a separate class or use an existing package for this, e.g. one of these:
- https://packagist.org/packages/sabberworm/php-css-parser
- https://packagist.org/packages/soloproyectos-php/css-parser
@JakeQZ Do you have experience with CSS parsing libraries and can recommend one?
I've looked at a few PHP CSS minification tools (one of which I use), and they've all used their own regex parsing. I'm not familiar with either of the above.
I've read the readmes of the two suggested possibilities, but have not looked in any detail...
* https://packagist.org/packages/sabberworm/php-css-parser
Looks like it would parse the CSS and organize into an OO data structure along the lines of the W3C spec. Seems to misname 'Declaration' as 'Rule', but the principle and organization appear correct, so that should be a non-issue. Property values have their own classes (like URL, Color) - we really only care about the string (provided the !important flag is separated), but hopefully that won't be an issue. Actively maintained. Looks worth investigating further.
* https://packagist.org/packages/soloproyectos-php/css-parser
Looks a bit like a jQuery-like extension for PHP DOM, rather than a CSS parser. Doesn't appear to have been updated for 4 years. At first glance, doesn't look applicable/useful to Emogrifier.
* https://packagist.org/packages/sabberworm/php-css-parserLooks like it would parse the CSS and organize into an OO data structure along the lines of the W3C spec.
I've started to look at this in more detail, and begun trying out some changes to our implementation to use it. So far so good.
I think as a first step, we could use this to parse the raw CSS, then collate the data structure returned and put into the structure we currently use internally, as is returned by CssInliner::parseCssRules.
So, initially, we would use this as far as CssInliner::parseCssRules, and that could make for a self-contained improvement. Moving forward from there, we can then look at propagating the Sabberworm data structures and objects further along the stream to replace the associative arrays we currently use...
Having subconsciously thought about https://github.com/MyIntervals/emogrifier/pull/912#issuecomment-653879492
move the CSS parsing either to a separate class or use an existing package
for a while, I think we should probably do both:
orand then
Refactoring first. Maybe add a CssParser class that can take as input some CSS, and spit out the arrays of associative/object-like arrays that are used internally by CssInliner. Probably up to but not including separating 'inlinable' and 'uninlinable' which is implementation-specific (though I'm not expecting other implementations, it's still preferable to make logical separations in the design).
Several steps later, if we do use a 3rd party lib (like sabberworm/php-css-parser) for the meat and bones, the CssParser class that was added can perhaps be obsoleted/retired, having been a useful stepping stone for incremental changes on the way.