commonmark-spec

Clarification of line endings in code blocks

Open andrewbranch opened this issue 5 years ago • 2 comments

The spec says of indented code blocks:

The contents of the code block are the literal contents of the lines, including trailing line endings, minus four spaces of indentation.

It doesn’t mention line endings at all for fenced code blocks. Comparing this limited description with the examples listed in the spec and with the commonmark.js implementation reveals an apparent inconsistency: commonmark.js seems to normalize the line endings in code block contents rather than include them as written. As a simple proof, you can go to https://spec.commonmark.org/dingus/, open the web developer tools, and evaluate in the console:

new commonmark.Parser().parse('    this line ends in CRLF\r\n    as does this line\r\n').firstChild.literal.includes('\r')
// false

The same behavior occurs for fenced code blocks too. I’m unsure whether this is a bug in the spec or in commonmark.js, but the current spec text, at least for indented code blocks, seems to imply that the line endings should come unmodified from the source.
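To make the two readings concrete, here is a minimal sketch of both. `indentedCodeLiteral` is a hypothetical helper written for illustration, not commonmark.js code, and it assumes every non-blank line is indented with exactly four spaces (no tabs, no blank-line edge cases):

```javascript
// Hypothetical helper, NOT part of commonmark.js: builds the literal of an
// indented code block under either reading of the spec text.
function indentedCodeLiteral(source, normalizeEndings) {
  // Splitting on a capturing group keeps the line endings in the result:
  // [content, ending, content, ending, ..., trailing content]
  const parts = source.split(/(\r\n|\n|\r)/);
  let literal = '';
  for (let i = 0; i < parts.length; i += 2) {
    // Strip the four spaces of indentation (assumption: spaces only).
    const content = parts[i].replace(/^ {4}/, '');
    const ending = parts[i + 1] || '';
    // Either keep the ending as written or rewrite it to "\n".
    literal += content + (normalizeEndings && ending !== '' ? '\n' : ending);
  }
  return literal;
}

const source = '    this line ends in CRLF\r\n    as does this line\r\n';
const asWritten = indentedCodeLiteral(source, false);
const normalized = indentedCodeLiteral(source, true);

console.log(asWritten.includes('\r'));  // true: the literal reading of the spec
console.log(normalized.includes('\r')); // false: matches what commonmark.js returns
```

The second call reproduces the result observed in the dingus; the first is what the spec wording appears to describe.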

Anyone know which is correct?

andrewbranch, Mar 28 '20 20:03

While I understand the inconsistency (it is indeed not documented in CM), does this result in an actual bug for you?

For what it’s worth, the HTML spec does the same (although there it is explicitly specified): \r and \r\n are changed to \n.
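As a rough sketch of that kind of normalization (the function name `normalizeNewlines` is made up for illustration and is not taken from the HTML spec or from commonmark.js):

```javascript
// Hypothetical helper: rewrite every CRLF pair and every lone CR to LF.
function normalizeNewlines(text) {
  // "\r\n?" matches a CR optionally followed by LF, so CRLF is replaced
  // as a single unit rather than producing two line feeds.
  return text.replace(/\r\n?/g, '\n');
}

console.log(JSON.stringify(normalizeNewlines('a\r\nb\rc\n'))); // "a\nb\nc\n"
```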

wooorm, Jul 04 '20 17:07

Not really; at the time I was working on a CommonMark-compliant parser and wanted clarification for my own implementation.

andrewbranch, Jul 05 '20 23:07