emogrifier icon indicating copy to clipboard operation
emogrifier copied to clipboard

Refactor the CSS parsing

Open oliverklee opened this issue 7 years ago • 5 comments

We should move the CSS parsing either to a separate class or use an existing package for this, e.g. one of these:

  • https://packagist.org/packages/sabberworm/php-css-parser
  • https://packagist.org/packages/soloproyectos-php/css-parser

oliverklee avatar Apr 02 '18 13:04 oliverklee

@JakeQZ Do you have experience with CSS parsing libraries and can recommend one?

oliverklee avatar Apr 02 '18 18:04 oliverklee

I've looked at a few PHP CSS minification tools (one of which I use), and they've all used their own regex parsing. I'm not familiar with either of the above.

JakeQZ avatar Apr 02 '18 18:04 JakeQZ

I've read the readmes of the two suggested possibilities, but have not looked in any detail...

* https://packagist.org/packages/sabberworm/php-css-parser

Looks like it would parse the CSS and organize into an OO data structure along the lines of the W3C spec. Seems to misname 'Declaration' as 'Rule', but the principle and organization appear correct, so that should be a non-issue. Property values have their own classes (like URL, Color) - we really only care about the string (provided the !important flag is separated), but hopefully that won't be an issue. Actively maintained. Looks worth investigating further.

* https://packagist.org/packages/soloproyectos-php/css-parser

Looks a bit like a jQuery-like extension for PHP DOM, rather than a CSS parser. Doesn't appear to have been updated for 4 years. At first glance, doesn't look applicable/useful to Emogrifier.

JakeQZ avatar Sep 02 '19 00:09 JakeQZ

* https://packagist.org/packages/sabberworm/php-css-parser

Looks like it would parse the CSS and organize into an OO data structure along the lines of the W3C spec.

I've started to look at this in more detail, and begun trying out some changes to our implementation to use it. So far so good.

I think as a first step, we could use this to parse the raw CSS, then collate the data structure returned and put into the structure we currently use internally, as is returned by CssInliner::parseCssRules.

So, initially, we would use this as far as CssInliner::parseCssRules, and that could make for a self-contained improvement. Moving forward from there, we can then look at propagating the Sabberworm data structures and objects further along the stream to replace the associative arrays we currently use...

JakeQZ avatar Jun 18 '20 16:06 JakeQZ

Having subconsciously thought about https://github.com/MyIntervals/emogrifier/pull/912#issuecomment-653879492

move the CSS parsing either to a separate class or use an existing package

for a while, I think we should probably do both:

orand then

Refactoring first. Maybe add a CssParser class that can take as input some CSS, and spit out the arrays of associative/object-like arrays that are used internally by CssInliner. Probably up to but not including separating 'inlinable' and 'uninlinable' which is implementation-specific (though I'm not expecting other implementations, it's still preferable to make logical separations in the design).

Several steps later, if we do use a 3rd party lib (like sabberworm/php-css-parser) for the meat and bones, the CssParser class that was added can perhaps be obsoleted/retired, having been a useful stepping stone for incremental changes on the way.

JakeQZ avatar Nov 28 '20 23:11 JakeQZ