commonmark-spec
Clarify contents of indented code block for a line that does not have line ending
With this string as an input (note, no trailing newline):
"    test"
Should the resulting code block node literal be "test" or "test\n"? commonmark.js (and others such as commonmark-java) currently return the latter.
The spec is not clear on this (from http://spec.commonmark.org/0.28/#indented-chunk):
The contents of the code block are the literal contents of the lines, including trailing line endings
Interesting test case! I think there should be no newline within the code tag, because the spec doesn't tell us to conjure up a newline out of thin air.
I guess this is a similar case:
"```\ntest"
The reference implementations ensure that the input ends with a newline character (adding one if needed).
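To make the two readings concrete, here is a minimal sketch in Python. It is a toy model, not the code of commonmark.js or any other implementation: the helper names `ensure_trailing_newline` and `indented_code_literal` are made up for illustration, and every line is assumed to be indented by at least four spaces.

```python
def ensure_trailing_newline(text: str) -> str:
    # Append a newline only when the input does not already end with a line ending.
    return text if text.endswith(("\n", "\r")) else text + "\n"


def indented_code_literal(text: str, normalize: bool) -> str:
    # Toy model of an indented code block: strip four spaces of indentation
    # from each line and keep whatever line ending the line has.
    if normalize:
        text = ensure_trailing_newline(text)
    lines = text.splitlines(keepends=True)
    return "".join(line[4:] for line in lines)


# With input normalization, the literal gains a newline the author never typed.
print(repr(indented_code_literal("    test", normalize=True)))   # 'test\n'
# Without normalization, the literal is exactly what the line contained.
print(repr(indented_code_literal("    test", normalize=False)))  # 'test'
```

Under this model the "test\n" result is a side effect of normalizing the input before parsing, not something the quoted spec sentence requires on its own.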
Per https://github.com/pablohirafuji/elm-markdown/issues/6 , this behavior is a problem for our product because we need to parse and unparse the code often, and every cycle adds more and more \n to the markdown.
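A rough sketch of how such growth can happen, in Python. This is an assumption about the mechanism, not the actual code from the linked issue: `parse_fenced`, `unparse_naive` and `unparse_aware` are hypothetical helpers, and the parser is a toy that only handles a single fenced block.

```python
FENCE = "`" * 3  # three backticks


def parse_fenced(markdown: str) -> str:
    # Toy parser: return the literal between the opening and closing fence.
    # Every content line, including the last one, keeps a line ending.
    lines = markdown.splitlines()
    body = lines[1:]
    if body and body[-1] == FENCE:
        body = body[:-1]  # drop the closing fence if present
    return "".join(line + "\n" for line in body)


def unparse_naive(literal: str) -> str:
    # Hypothetical serializer that always writes a newline before the closing fence.
    return f"{FENCE}\n{literal}\n{FENCE}"


def unparse_aware(literal: str) -> str:
    # Serializer that adds a newline only when the literal lacks one.
    if literal and not literal.endswith("\n"):
        literal += "\n"
    return f"{FENCE}\n{literal}{FENCE}"


def round_trip(markdown: str, unparse, cycles: int) -> str:
    # Parse and unparse repeatedly, then return the final literal.
    for _ in range(cycles):
        markdown = unparse(parse_fenced(markdown))
    return parse_fenced(markdown)


source = FENCE + "\ntest"  # unclosed fence, no trailing newline
print(repr(round_trip(source, unparse_naive, 3)))  # 'test\n\n\n\n'
print(repr(round_trip(source, unparse_aware, 3)))  # 'test\n'
```

In this sketch the naive serializer adds one newline per cycle, while the serializer that accounts for the trailing newline in the literal reaches a fixed point after the first cycle.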
I'd argue that this newline before the closing ``` should not be part of the code block's .textContent because it is never useful:
- Users would not consider it part of the entered content
- HTML <pre> renders the same with or without it, so it's superfluous
GitHub also does not render this newline in its HTML output, so its rendering is (imho rightfully) in violation of CommonMark.
because it is never useful:
One should be careful with this kind of assessment. Code fences are used as a verbatim environment to integrate foreign textual content into markdown, for which final blank lines could be meaningful. I think the spec should rather treat the data "as is" and leave it to code fence processors to do what they think is good (e.g. based on additional data in the info string).
For cross-reference: I also got significantly confused by this part of the spec, and it could use some clarification (IIRC even the dingus and cmark do not agree); see this discussion.