Normalising Line-Endings

Open • href opened this issue 6 years ago • 1 comment

When \r\n is used for line endings, the behaviour can be a bit surprising: two trailing spaces before \r\n are not rendered as a hard line break, whereas before \n they are:

>>> from mistletoe import Document, HTMLRenderer

>>> HTMLRenderer().render(Document("foo  \r\nbar"))
'<p>foo  \r\nbar</p>\n'

>>> HTMLRenderer().render(Document("foo  \nbar"))
'<p>foo<br />\nbar</p>\n'

I stumbled on this today and I thought I'd report it. Not sure what the correct behaviour should be though. I only found it mentioned here: https://talk.commonmark.org/t/carriage-returns-and-code-blocks/2519

href, Sep 20 '18 08:09

I can't find an authoritative answer from CommonMark either. I used to replace \r\n line endings with \n in Document.__init__, but stopped doing so after I discovered that Python's open function normalizes line endings automatically when a file is read in text mode. See the documentation on open's newline parameter.
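For reference, here is a small sketch of the open behaviour mentioned above, using only the standard library. The temporary file and the variable names are made up for illustration. It writes a CRLF line ending as raw bytes, then reads it back with the default newline setting and with newline="".

```python
import os
import tempfile

# Write raw bytes so the file really contains a CRLF line ending.
with tempfile.NamedTemporaryFile(delete=False, suffix=".md") as tmp:
    tmp.write(b"foo  \r\nbar")
    path = tmp.name

try:
    # Default (newline=None): universal newlines, "\r\n" is translated to "\n".
    with open(path, encoding="utf-8") as f:
        translated = f.read()

    # newline="": line endings are returned to the caller untranslated.
    with open(path, encoding="utf-8", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)

print(repr(translated))    # 'foo  \nbar'
print(repr(untranslated))  # 'foo  \r\nbar'
```

So text read with a plain open(path) call arrives with \n endings, while strings built in code, read in binary mode and decoded, or read with newline="" can still carry \r\n.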

Is there a reason you would want to pass in raw strings with CRLF line endings to mistletoe? It is an easy fix: I just need to add one line in Document.__init__, but it will incur some performance cost in cases where the normalization is not otherwise needed.
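Until that is decided, a caller-side workaround might look like the sketch below. The helper name normalize_newlines is hypothetical and not part of mistletoe; it only uses str.replace, and it is not necessarily the one-line change that would go into Document.__init__.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF (hypothetical helper)."""
    # Replace "\r\n" first so a CRLF pair does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


normalized = normalize_newlines("foo  \r\nbar")
print(repr(normalized))  # 'foo  \nbar'
```

Passing the normalized string to Document, as in HTMLRenderer().render(Document(normalize_newlines(text))), should then give the same result as the \n example in the report. Text that already uses \n passes through unchanged, so the only cost is the extra scan of the string.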

miyuchina, Jun 08 '19 19:06