
(Intentional?) inconsistency between 4.6 block HTML and 6.6 raw HTML comments

Open wooorm opened this issue 3 years ago • 5 comments

The block HTML algorithm here allows <!-->, <!--->, etc., as comments. These comments are also fine by the HTML parser (13.2.5.44, case for U+002D). (Note that there are a couple of cases, such as <!> and <!->, which HTML also allows but treats as parse errors; I am not talking about those.)

The “inline” algorithm here does not allow <!--> or <!--->. They look a lot like comments, so I don’t really expect people to depend on these characters to be text. And it’s inconsistent with blocks. Can we change the spec to allow them?
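To make the difference concrete, here is a rough Python sketch of the three behaviours under discussion. The function names are hypothetical, and the rules are my reading of the spec text at the time rather than a quotation of it: the block check is simplified to "line starts with `<!--` and contains `-->`" (ignoring indentation and multi-line blocks), the strict inline check assumes the body may not start with `>` or `->`, end with `-`, or contain `--`, and the relaxed check is only one possible shape for the proposed change.

```python
def closes_block_comment(line: str) -> bool:
    """Simplified block rule: the line opens with '<!--' and contains '-->'.

    Assumption: indentation and multi-line blocks are ignored here.
    """
    return line.startswith("<!--") and "-->" in line


def is_inline_comment_strict(s: str) -> bool:
    """Assumed older inline rule: '<!--' + body + '-->' with a restricted body."""
    if len(s) < 7 or not (s.startswith("<!--") and s.endswith("-->")):
        return False
    body = s[4:-3]
    return not (
        body.startswith(">")
        or body.startswith("->")
        or body.endswith("-")
        or "--" in body
    )


def is_inline_comment_relaxed(s: str) -> bool:
    """One possible relaxed rule: accept '<!-->' and '<!--->', otherwise
    require that the first '-->' after the opener is the end of the string."""
    if s in ("<!-->", "<!--->"):
        return True
    if len(s) < 7 or not s.startswith("<!--"):
        return False
    return s.find("-->", 4) == len(s) - 3


if __name__ == "__main__":
    for sample in ["<!-->", "<!--->", "<!-- ok -->", "<!-- a -- b -->"]:
        print(
            repr(sample),
            closes_block_comment(sample),
            is_inline_comment_strict(sample),
            is_inline_comment_relaxed(sample),
        )
```

Under these assumptions, <!--> and <!---> pass the block check and the relaxed inline check but fail the strict inline check, which is the inconsistency this issue is about.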

I can do the work

wooorm avatar Jun 14 '22 09:06 wooorm

Yes, I'm in favor.

jgm avatar Jun 14 '22 17:06 jgm

Good to hear! One thing I was wondering: -- inside a comment is a similar case. For example, <!-- some stuff -- some more stuff -->. Is that OK too?

wooorm avatar Jun 14 '22 17:06 wooorm

If I recall, we deliberately simplified the comment parsing (even though this diverges from the HTML standard). I don't remember why, though. I'm okay with implementing something more standard as long as it doesn't increase complexity too much, both in the spec and in parsers.

jgm avatar Jun 14 '22 17:06 jgm

I wouldn’t know why that was the case! Perhaps if you care more about XML than HTML?

In my case, this just removes states in my state machine that are needed for inline, but not for block. I can see -- in comments being used by humans, so that might even be considered a bug fix.

wooorm avatar Jun 14 '22 17:06 wooorm

For reference, the HTML5 spec for comments: https://html.spec.whatwg.org/multipage/syntax.html#comments

jgm avatar Jun 14 '22 18:06 jgm

Thanks for merging this, John!

wooorm avatar Sep 08 '22 16:09 wooorm

Reopening until we get the issue of <!--> and <!---> (not to mention <!-- hi -->) sorted out. See comments on linked PR.

jgm avatar Sep 08 '22 18:09 jgm

I think an inconsistency between the block and inline cases is okay, given that the spec for block HTML allows invalid HTML.

jgm avatar Sep 08 '22 18:09 jgm

However, allowing -- inside HTML comments is a change worth making.

jgm avatar Sep 08 '22 18:09 jgm

Commented in the PR: https://github.com/commonmark/commonmark-spec/pull/713#issuecomment-1241059993.

wooorm avatar Sep 08 '22 18:09 wooorm