Clarification of line endings in code blocks
The spec says of indented code blocks:
The contents of the code block are the literal contents of the lines, including trailing line endings, minus four spaces of indentation.
The spec doesn’t mention line endings at all for fenced code blocks. Comparing this limited description with the examples in the spec and with the commonmark.js implementation reveals an apparent inconsistency: commonmark.js seems to normalize line endings in code block contents rather than include them as written. As a simple demonstration, go to https://spec.commonmark.org/dingus/, open the web developer tools, and evaluate the following in the console:
new commonmark.Parser().parse('    this line ends in CRLF\r\n    as does this line\r\n').firstChild.literal.includes('\r')
// false
The same behavior occurs for fenced code blocks too. I’m unsure whether this is a bug in the spec or in commonmark.js, but the current spec text, at least for indented code blocks, seems to imply that the line endings should come unmodified from the source.
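For comparison, here is a minimal sketch of what the literal reading of that spec sentence would produce for an indented code block: each line keeps its trailing line ending exactly as written and loses only four leading spaces. The name `literalIndentedContents` is hypothetical and not part of commonmark.js, and the sketch assumes every line is indented with at least four spaces (no tabs, no blank-line handling).

```javascript
// Hypothetical helper, not part of commonmark.js: the "literal" reading of
// the spec text for an indented code block.
function literalIndentedContents(source) {
  // Split into lines while keeping each line's own ending (\r\n, \n, or \r).
  const lines = source.match(/[^\r\n]*(?:\r\n|\n|\r)|[^\r\n]+$/g) || [];
  // Remove four spaces of indentation; leave the line ending untouched.
  return lines.map((line) => line.replace(/^ {4}/, '')).join('');
}

const source = '    this line ends in CRLF\r\n    as does this line\r\n';
console.log(literalIndentedContents(source).includes('\r'));
// true
```

Under this reading the `\r` characters survive, which is the opposite of what the dingus reports.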
Anyone know which is correct?
While I understand the inconsistency (it is indeed not documented in the CommonMark spec), does this result in an actual bug for you?
For what it’s worth, the HTML spec does the same, although there it is explicitly specified: \r\n and lone \r are both changed to \n.
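As a rough illustration of that kind of normalization, a one-line replacement is enough. This is only a sketch: `normalizeLineEndings` is a hypothetical name, and whether commonmark.js does this the same way internally is not confirmed here.

```javascript
// Hypothetical helper: turn every CRLF pair and every lone CR into LF.
function normalizeLineEndings(text) {
  // \r\n? matches a CR optionally followed by LF, so a CRLF pair becomes a
  // single LF rather than two.
  return text.replace(/\r\n?/g, '\n');
}

const normalized = normalizeLineEndings('this line ends in CRLF\r\nas does this line\r\n');
console.log(normalized.includes('\r'));
// false
```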
Not really; at the time I was working on a CommonMark-compliant parser and wanted clarification for my own implementation.