dompdf icon indicating copy to clipboard operation
dompdf copied to clipboard

Use proper CSS parser

Open samuelvogel opened this issue 12 years ago • 7 comments

The current CSS parser seems very hackish (alot of regular expressions, trims and the like). Why not use a proper CSS parser like https://github.com/sabberworm/PHP-CSS-Parser? This would solve a lot of problems, as the current parser does not properly support minified CSS (for us it breaks on one-line @page and font-face directives)

samuelvogel avatar Aug 15 '13 10:08 samuelvogel

Now don't be cruel. That project also uses RegEx, trim, and the like! When you're parsing strings you use the best tool for the current task. RegEx can be a great tool for string parsing so long as you can deal with two problems instead of one.

We have actually talked about using a third-party parser in the past. dompdf has a lot of baggage since the code was originally written many years ago (beginning in 2005). Our code may be more old-school monolithic, but it's fairly robust.

Still, it doesn't hurt to look at other options. So we'll keep this one on the table.

@sabberworm's code is definitely better engineered, but it would require some effort to implement in dompdf. The most recent version of the code uses namespaces, which means he's targeting PHP 5.3 or newer. Though we may move in that direction for the post-0.6 release we're still supporting PHP 5.2.x. That doesn't preclude us from using an earlier version of the code, but that also means we get no updates.

All being said, we have some other serious issues we want to resolve before we look at the CSS parser. In the meantime you can post bug reports resulting from the current parser and we'll see how much work would have to be put in to patch them.

bsweeney avatar Aug 16 '13 02:08 bsweeney

Yes you are right, but PHP-CSS-Parser just looks so much cleaner. I did have a look into Stylesheet->_parse_css() and thought that it was easy to patch at first, but after having a look at the rest of the Stylesheet class I realized not so much...

samuelvogel avatar Aug 16 '13 08:08 samuelvogel

Perhaps. The difficulty should be in the teasing out of the styles. There's only so much you can do to minimize CSS and we should be able to handle most of those techniques. If you have some CSS you want us to test feel free to post it.

bsweeney avatar Aug 16 '13 14:08 bsweeney

Sure, I did so here #699. It would be great if you could have a look at it!

samuelvogel avatar Aug 16 '13 14:08 samuelvogel

That is something I was looking at a few months ago, @sabberworm's project was my first choice, but I saw another similar project: https://github.com/ju1ius/css-parser, even if it is less active, the code base is pretty impressive. I tried to quickly integrate both of them, but couldn't find time to finish. I think this will be part of the 0.7 release, not before, for the reasons @bsweeney mentionned.

PhenX avatar Aug 18 '13 15:08 PhenX

"we're still supporting PHP 5.2.x" Seriously? The latest version of PHP 5.2 was released in December 2010 and is unsupported since then. If the "code is definitely better engineered" and makes for a better end product, then why support a PHP version that is no longer supported by the company that makes it?

Ugoku avatar Sep 03 '13 08:09 Ugoku

@Ugoku the release currently in development was started while PHP 5.2.x was still actively developed. It doesn't make much sense to restart development now just to support PHP 5.3+. Plus, PHP 5.2.x still appears to have a fairly large install base according to W3Techs.

We are a small team with limited resources (namely time) and so must dedicate those resources judiciously. In considering future development we need to take into account not only code architecture but also library functionality and the needs of our user base. Rewriting dompdf would not be a trivial matter (such undertakings never are). With some important rendering issues to address it doesn't yet make sense to start worrying about the code architecture. Especially when the code is compatible forwards as well as backwards.

That's not to say we don't want to make some significant changes in the future. And certainly the topic has come up multiple times. But changes of that nature are not yet warranted.

bsweeney avatar Sep 03 '13 21:09 bsweeney