Void phrasing elements (`br`, `embed`, `img`, `input` and `wbr`) not wrapped in `<p>` when standalone
HTML includes void elements (elements without closing tags). Among these, the behavior of phrasing content (inline elements) when written in raw HTML is undefined in CommonMark.
Void elements are documented at https://developer.mozilla.org/en-US/docs/Glossary/Void_element. Within this, the following five are phrasing content:
- br
- embed
- img
- input
- wbr
When these elements appear alone, Markdown should treat them as elements within a paragraph. However, the current reference implementation does not do this.
input
<img src="https:/example.com/image.png" alt="An image" title="Image title" />
<input type="text" name="name" value="value">
<br>
<wbr />
current actual output
<img src="https:/example.com/image.png" alt="An image" title="Image title" />
<input type="text" name="name" value="value">
<br>
<wbr />
expected output
<p><img src="https:/example.com/image.png" alt="An image" title="Image title" /></p>
<p><input type="text" name="name" value="value"></p>
<p><br></p>
<p><wbr /></p>
Why I think these should be wrapped in p elements
Currently, a standalone Markdown image or two or more consecutive raw <br> elements are wrapped in a <p> element. Therefore, for consistency, I believe these void elements should also be wrapped when they appear alone.
input
![img.png](https://example.com/img.png)

<br><br>
current output
<p><img src="https://example.com/img.png" alt="img.png"></p>
<p><br><br></p>
Please read https://spec.commonmark.org/0.31.2/#html-blocks
If these conditions are not met, the element will be considered inline and will be wrapped in p tags in HTML output.
Currently, standalone Markdown image element or two or more consecutive raw <br> elements are wrapped in a <p> element. Therefore, for consistency, I believe these void elements should also be wrapped when they appear alone.
The above section of the spec will explain why. The spec is designed to give flexibility to the writer to determine whether something is passed through as a raw HTML block or considered inline HTML elements.
Thank you. I now understand that this case falls under start condition 7.
However, since the HTML specification defines void elements as having no closing tags, I believe that the CommonMark specification should also define a behavior that treats them specially.
https://developer.mozilla.org/en-US/docs/Glossary/Void_element
For example, <input> does not leave a block or tag open; it is opened and closed at the same time. So <input> by itself is equivalent in meaning to <input></input> (although writing <input></input> is not valid HTML). When <input></input> is given to CommonMark, it is wrapped in <p>.
Have you tried the form <input />?
I guess that according to https://spec.commonmark.org/0.31.2/#open-tag <input /> falls under the definition of an "open tag" and would be treated the same as <input>.
Perhaps you are right that the spec should have a notion of void element and give them special treatment. But, in order to do what you are asking, we'd also have to track which void elements can only occur as phrasing elements. The spec around raw HTML would become considerably more complex. If you want to propose a specific change, we could consider it.
But I can also propose some easy workarounds:
<input ...><!-- -->
or
<p><input ...></p>
I believe it would suffice to add a specification stating that these elements should not be treated as block elements during the processing of HTML block 7, as follows.
https://github.com/Songmu/commonmark-spec/commit/cb38575b8717cf12e4c628cc8d46a0b809884f47
As discussed previously, it's odd that these are treated as block elements. Adding this specification alone will ensure these elements are correctly interpreted as raw HTML within paragraphs.
Adding this specification will maintain compatibility in the test suite. Subsequent implementation adjustments and additional test cases can follow.
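To make the proposal concrete, here is a rough Python sketch of the start check for HTML block type 7, with and without the proposed exclusion. It is not the reference implementation and not the text of the linked commit. The regular expressions are my own transcription of the spec's "open tag" and "closing tag" definitions, `starts_type7_block` and its `proposed` flag are hypothetical names, start conditions 1 to 6 and the rule that type 7 cannot interrupt a paragraph are ignored, and applying the exclusion to closing tags as well is an assumption.

```python
import re

# Building blocks transcribed from the CommonMark "raw HTML" definitions
# (single-line only, so whitespace is limited to spaces and tabs).
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^ \t\n\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = rf"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTRIBUTE = rf"(?:[ \t]+{ATTR_NAME}(?:[ \t]*=[ \t]*{ATTR_VALUE})?)"
OPEN_TAG = rf"<(?P<open>{TAG_NAME}){ATTRIBUTE}*[ \t]*/?>"
CLOSING_TAG = rf"</(?P<close>{TAG_NAME})[ \t]*>"

# Condition 7: up to three spaces of indentation, one complete open or
# closing tag, then only spaces or tabs until the end of the line.
CONDITION_7 = re.compile(rf"^ {{0,3}}(?:{OPEN_TAG}|{CLOSING_TAG})[ \t]*$")

# Tag names the spec already excludes from condition 7.
EXCLUDED_BY_SPEC = {"pre", "script", "style", "textarea"}
# Tag names the proposal would additionally exclude.
VOID_PHRASING = {"br", "embed", "img", "input", "wbr"}


def starts_type7_block(line: str, proposed: bool = False) -> bool:
    """Return True if `line` would start an HTML block of type 7.

    Hypothetical helper: conditions 1-6 and paragraph interruption
    are deliberately not modelled.
    """
    match = CONDITION_7.match(line.rstrip("\n"))
    if match is None:
        return False
    name = (match.group("open") or match.group("close")).lower()
    excluded = EXCLUDED_BY_SPEC | VOID_PHRASING if proposed else EXCLUDED_BY_SPEC
    return name not in excluded


if __name__ == "__main__":
    samples = [
        '<input type="text" name="name" value="value">',
        "<wbr />",
        "<br><br>",
        "<input><!-- -->",
    ]
    for sample in samples:
        print(sample, starts_type7_block(sample), starts_type7_block(sample, proposed=True))
```

With `proposed=False`, a standalone `<input ...>` or `<wbr />` line starts an HTML block, which corresponds to the current output shown at the top of this issue. With `proposed=True`, the same lines fall through to paragraph parsing. `<br><br>`, `<input></input>` and the `<input ...><!-- -->` workaround never match in either mode, because something other than spaces or tabs follows the first tag.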
Reason for the 5 Elements
The elements targeted this time are br, embed, img, input, and wbr.
Reading the HTML specification, these are the only 5 out of the 13 void elements that are also classified as phrasing content without special conditions. (area, link, and meta are excluded because they are classified as phrasing content only under specific conditions.)
ref:
- https://html.spec.whatwg.org/multipage/dom.html#phrasing-content
- https://html.spec.whatwg.org/multipage/syntax.html#void-elements
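The selection above can be written down as a small Python sketch. The element lists are copied by hand from the two HTML Standard sections referenced above, and the variable names are only illustrative, so please treat it as a cross-check of the reasoning rather than as something derived from the spec automatically.

```python
# The 13 void elements listed in the HTML Standard.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})

# Void elements that appear in the phrasing content list, mapped to the
# condition attached to them there (None means unconditional).
PHRASING_CONDITION = {
    "area": "only as a descendant of a map element",
    "br": None,
    "embed": None,
    "img": None,
    "input": None,
    "link": "only when it is allowed in the body",
    "meta": "only when the itemprop attribute is present",
    "wbr": None,
}

# Keep the void elements that are phrasing content without any condition.
unconditional_void_phrasing = sorted(
    name
    for name in VOID_ELEMENTS
    if name in PHRASING_CONDITION and PHRASING_CONDITION[name] is None
)

if __name__ == "__main__":
    print(unconditional_void_phrasing)
```

The remaining void elements (`base`, `col`, `hr`, `source`, `track`) are not phrasing content at all, and `area`, `link` and `meta` drop out because of their conditions, which leaves the five elements targeted here.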
While more might be added in the future, based on past changes, it's unlikely that we need to worry about it too much. If we add specifications for these now, the likelihood of further modifications being made later seems low.
In the real world, it's rare for the wbr element to appear alone on a line, but for the sake of specification consistency, this seems acceptable.
Thank you for discussing this. I've filed this change as a pull request. https://github.com/commonmark/commonmark-spec/pull/810