
TEXT_RCDATA Fields and Processing Instructions

Open • asabosch opened this issue 9 years ago • 1 comment

In TEXT_RCDATA fields like <title> it is not possible to use processing instructions.

Could this be the sole exception for RCDATA fields or is this against the spec?

asabosch avatar Nov 07 '15 18:11 asabosch

Any tag-like thing inside of an RCDATA field has to be shown as-is. Here's the test case example: https://github.com/Masterminds/html5-php/blob/2.x/test/HTML5/Parser/TokenizerTest.php#L894

The spec is a little hard to read on this point because you have to trace through all of the parser rules, but this section shows what is supposed to happen when the tokenizer hits a tag-like thing: http://www.w3.org/html/wg/drafts/html/master/syntax.html#rcdata-end-tag-name-state
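To make the rule concrete, here is a rough sketch of how RCDATA content ends only at the matching end tag, with everything before it treated as text. `readRcdata` is a hypothetical helper written for illustration, not html5-php's tokenizer, and it skips the character-reference decoding that a real tokenizer would still do inside RCDATA.

```php
<?php
// Hypothetical helper (not part of html5-php): splits $input at the first
// end tag that matches $tagName and returns [text, remainder].
function readRcdata(string $input, string $tagName): array
{
    // An end tag only closes RCDATA when its name matches the open element
    // and is followed by whitespace, "/", or ">". The match is case-insensitive.
    $pattern = '~</' . preg_quote($tagName, '~') . '(?=[\t\n\f />])~i';

    if (preg_match($pattern, $input, $m, PREG_OFFSET_CAPTURE) !== 1) {
        // No matching end tag: everything up to the end of input is text.
        return [$input, ''];
    }

    $offset = $m[0][1];

    // Everything before the end tag is plain text, including tag-like
    // sequences and processing-instruction-like sequences.
    return [substr($input, 0, $offset), substr($input, $offset)];
}

// Input as it would appear right after an opening title tag.
[$text, $rest] = readRcdata('a <b>bold</b> <?php echo 1; ?> title</title><p>', 'title');
```

Here `$text` keeps both the `<b>` markup and the PHP block verbatim, which is the "shown as-is" behavior described above.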

But you've hit on an interesting case: HTML5 does not define (well, really, it does not allow) processing instructions. The prevailing wisdom has been to pre-process them out of the HTML.
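As an illustration of the pre-processing approach, a minimal sketch might strip anything that looks like a processing instruction before the markup reaches the HTML parser. `stripProcessingInstructions` is a hypothetical name, and the regex is deliberately naive: it assumes a PI never contains the closing sequence inside a string literal.

```php
<?php
// Hypothetical helper (not part of html5-php): removes processing
// instructions from a string of HTML before parsing.
function stripProcessingInstructions(string $html): string
{
    // Lazily match from the opening sequence to the first closing sequence.
    // An unterminated instruction is removed through to the end of input.
    // The "s" modifier lets "." span newlines.
    return preg_replace('~<\?.*?(?:\?>|\z)~s', '', $html);
}

$clean = stripProcessingInstructions('<title><?php echo $t; ?> - Site</title>');
```

In practice the instructions are usually executed or substituted by the server-side layer rather than simply dropped; the sketch only shows the step of keeping them away from the HTML5 parser.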

I personally would not object to amending HTML5-PHP to allow processing instructions inside of RCDATA. I say this because the goal of the project has for a long time been to support compatibility with existing implementations, not to follow the exact letter of the spec. Since PIs are peppered throughout server-side usage of HTML, I'd be fine with ignoring the precise constraints of the standard and allowing PIs inside of RCDATA.

So if someone submits a patch, I'd support it.

technosophos avatar Nov 07 '15 19:11 technosophos