nickel
The markdown `nickel doc` output prints some spurious backslashes
Describe the bug
`nickel doc`'s markdown output contains some backslashes that mess up the rendering. For instance, `{ x : Number = 1, y | doc "Blah" }` will be rendered as
# `x`
- `x : Number`
\
# `y`
\
Blah
cmark as well as GitHub's markdown renderer and pandoc --from=commonmark will show (some of) these backslashes literally:
Pandoc's own markdown format won't render them, but interprets them as escaping the start of the next line:
To Reproduce
$ nickel doc --stdout <<<'{ x : Number = 1, y | doc "Blah" }' | cmark
Expected behavior
An output that is at least CommonMark-compatible (and, if possible, compatible with other common markdown dialects like gfm or pandoc).
Environment
- OS name + version:
- Version of the code: 28672ee45bfe78dc3dc828ac1fe4fa40f99965e9
Additional context
We don't generate markdown by hand: we build an AST and use a crate called comrak, specifically its `format_commonmark` function. It seems that a backslash followed by a newline is indeed part of the CommonMark specification, where it represents a hard line break (cf. hard line breaks).
However, I think the issue is that those lines can't appear at the end of a block (put differently, they can, but won't be interpreted as a hard line break), but comrak seems to put them here nonetheless. So it sounds like a comrak bug.
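To make the rule concrete, here is a minimal Rust sketch of how a CommonMark reader treats a trailing backslash within a single paragraph. The name `backslash_is_hard_break` is made up for illustration and is not part of comrak or Nickel; the function only models the "not at the end of the block" condition and ignores everything else in the spec.

```rust
/// For each line of ONE paragraph, report whether a trailing backslash on
/// that line would act as a hard line break.
///
/// Hypothetical helper for illustration only (not comrak's API).
fn backslash_is_hard_break(paragraph: &[&str]) -> Vec<bool> {
    // Index of the last line; a backslash there sits at the end of the block.
    let last = paragraph.len().saturating_sub(1);
    paragraph
        .iter()
        .enumerate()
        .map(|(i, line)| {
            // An even run of trailing backslashes is a sequence of escaped
            // backslashes, so only an odd run leaves one free to form a break.
            let trailing = line.chars().rev().take_while(|c| *c == '\\').count();
            trailing % 2 == 1 && i < last
        })
        .collect()
}
```

Called on `&["Blah\\"]` (a one-line paragraph ending in a backslash), this returns `[false]`: the backslash is at the end of the block, so a compliant renderer shows it literally, which matches what cmark does with the output above.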
I tried to update to the latest comrak to see if by any chance that would solve the issue, but the new version is not backward compatible, so our code needs to be fixed first. I will keep this issue updated.
I had a quick poke at this because it seemed like it would be easy... updating comrak didn't help, and I found that comrak and cmark-gfm have different behavior here (cmark-gfm renders the line breaks as "space-space-newline" instead of "backslash-newline"). I reached out to comrak to see if they'd change it.
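The difference between the two spellings matters at the end of a block: per the CommonMark spec, trailing spaces at the end of a paragraph are stripped, while a trailing backslash is kept as a literal character. The following Rust sketch shows the two outputs side by side. `BreakStyle` and `render_paragraph` are hypothetical names for illustration and do not correspond to anything in comrak or cmark-gfm.

```rust
/// The two CommonMark spellings of a hard line break.
#[derive(Clone, Copy, Debug)]
enum BreakStyle {
    /// "backslash-newline"
    Backslash,
    /// "space-space-newline"
    TwoSpaces,
}

/// Join `lines` with hard breaks in the given style. If `trailing_break` is
/// set, a hard break is also emitted after the last line, i.e. at the end
/// of the block, which is the problematic case discussed in this issue.
fn render_paragraph(lines: &[&str], style: BreakStyle, trailing_break: bool) -> String {
    let hard_break = match style {
        BreakStyle::Backslash => "\\\n",
        BreakStyle::TwoSpaces => "  \n",
    };
    let mut out = lines.join(hard_break);
    out.push_str(if trailing_break { hard_break } else { "\n" });
    out
}
```

With `trailing_break = true`, the `Backslash` style ends the paragraph with a visible `\`, whereas the `TwoSpaces` style ends it with two spaces that a parser silently drops. That would explain why the same AST looks fine when the breaks are written the cmark-gfm way.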
I didn't file an issue on comrak because we weren't using the latest version, but I really think this is a bug on their end. The CommonMark standard says clearly that a backslash is interpreted as a hard line break only if it doesn't appear at the end of a block (which it does in comrak's output), so it's expected that compliant renderers show this backslash unmodified.
Incidentally, I think updating to the latest comrak is valuable even if it doesn't solve the current problem, so feel free to push those changes @jneem :slightly_smiling_face:
On the other hand, I think it's in fact not possible to have a line break at the end of a code block in markdown. That is, the AST we're producing is potentially impossible to produce by parsing valid markdown. So maybe we just want to check if we are at the end of a block before adding a line break?
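A rough sketch of that idea in Rust, on a toy AST rather than comrak's real node types: `Inline`, `trim_trailing_breaks` and `render_block` are all hypothetical names, and the model assumes a block is just a flat list of inline nodes, which is a simplification of what comrak actually builds.

```rust
/// Toy inline node; a stand-in for the real markdown AST (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Text(String),
    LineBreak,
}

/// Drop any hard line breaks sitting at the very end of a block, since
/// they cannot be expressed there in valid markdown.
fn trim_trailing_breaks(block: &mut Vec<Inline>) {
    while matches!(block.last(), Some(Inline::LineBreak)) {
        block.pop();
    }
}

/// Render one block, spelling hard breaks as backslash-newline, and
/// terminate the block with a newline.
fn render_block(block: &[Inline]) -> String {
    let mut out = String::new();
    for node in block {
        match node {
            Inline::Text(t) => out.push_str(t),
            Inline::LineBreak => out.push_str("\\\n"),
        }
    }
    out.push('\n');
    out
}
```

For a block holding `Text("Blah")` followed by `LineBreak`, `render_block` produces `Blah\` plus an empty line before trimming and a plain `Blah` line after `trim_trailing_breaks`. Whether the same check is as simple on the real AST is a separate question, since there the answer depends on which kind of node the doc ends with.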
Yeah, so I emailed the comrak maintainer directly (because I couldn't comment on the repo linked from crates.io and didn't realize there was a github mirror :facepalm:). The summary is that this behavior was introduced here and they'd be willing to go back to the old behavior as long as it doesn't regress this. I'll have a go at doing that.
> So maybe we just want to check if we are at the end of a block before adding a line break?
I tried this, but it doesn't seem super easy because it depends on what the doc is. NodeValue::block always reports true for the doc's ast node, but if it's a Paragraph then rendering and re-parsing merges it into the previous paragraph. I think the markdown ast was just not designed to round-trip losslessly...
The thing is that I'm now convinced the root of the problem lies in comrak's rendering of lists. What we did was a hacky workaround, but we shouldn't have to do that at all: we're correctly producing a list followed by a different block, but for some reason it's not rendered with the expected newline in between.
After fiddling with the code, it doesn't happen when you use e.g. the comrak executable to parse markdown or even just Nickel directly by stuffing that markdown into a doc block:
{
foo | doc m%"
- `foo | Number`
rest"%
}
will correctly render in the doc as
- `foo | Number`
rest