Markup stripping in reference links
The JavaScript parser treats reference links that contain markup differently: it strips the markup characters (here, the backticks) from the label.
A [`link`][] reference with backticks.
[`link`]: https://example.org
This generates a warning (Reference "link" not found) and outputs:
<p>A <a><code>link</code></a> reference with backticks.</p>
Writing [link]: https://example.org works correctly.
This is in contrast with the Pandoc parser, which only recognizes [`link`].
Actually, after checking, I saw that both djot.lua and jotdown have the same behavior as djot.js. So, the question is: which one is correct?
It seems that the intention is to strip the markup from the text within the brackets when producing the tag used to look up a definition. I don't see it specified in the syntax reference, but there is, for example, a test that verifies it in test/links_and_images.test:
[link _and_ link][]
[link and link]: url
.
<p><a href="url">link <em>and</em> link</a></p>
Presumably the intention is that you shouldn't have to copy the markup into the link definition tag, since that markup could in theory be quite complex.
I'm guessing this was missed in the pandoc implementation, though.
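To make the stripping behavior concrete, here is a minimal sketch of deriving a lookup tag from parsed inline content. The `Inline` type and `labelFromInlines` are hypothetical names made up for this illustration; they are not the actual djot.js types or functions.

```typescript
// Hypothetical minimal inline AST; NOT the real djot.js node types.
type Inline =
  | { tag: "str"; text: string }
  | { tag: "verbatim"; text: string }
  | { tag: "emph"; children: Inline[] };

// Concatenate the textual content of the nodes, dropping the markup itself.
function labelFromInlines(nodes: Inline[]): string {
  return nodes
    .map((n) => (n.tag === "emph" ? labelFromInlines(n.children) : n.text))
    .join("");
}

// Parsed form of "link _and_ link" from the test case above.
const strippedLabel = labelFromInlines([
  { tag: "str", text: "link " },
  { tag: "emph", children: [{ tag: "str", text: "and" }] },
  { tag: "str", text: " link" },
]);
```

Under this scheme the reference [link _and_ link][] is looked up as "link and link", which matches the definition in the test.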
At minimum, the behavior we want needs to be specified in the spec.
I think that if this causes an error, that's a problem with the implementation:
A [`link`][] reference with backticks.
[`link`]: https://example.org
That should just work.
At the moment, I don't recall why I implemented it the way I did in JS. The Haskell implementation takes the label to be the raw text between [ and ].
It should be reasonably straightforward to modify the reference link parser at https://github.com/jgm/djot.js/blob/main/src/inline.ts#L349-L376 so that, instead of converting existing matches to Str elements, it replaces them all with a new match with the raw text between brackets, taken from subject.
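As a rough sketch of the raw-text approach, assuming the parser knows the positions of the opening and closing brackets: the label is just a slice of the source. `rawLabel` and the `references` map are hypothetical and only illustrate the idea; this is not the code in inline.ts.

```typescript
// Hypothetical helper: return the raw source text between a "[" at
// `openPos` and its matching "]" at `closePos`, markup included.
function rawLabel(subject: string, openPos: number, closePos: number): string {
  return subject.slice(openPos + 1, closePos);
}

const subject = "A [`link`][] reference with backticks.";
const openPos = subject.indexOf("[");
const closePos = subject.indexOf("]", openPos);
const rawRefLabel = rawLabel(subject, openPos, closePos);

// Definitions keyed by the raw label exactly as written in the source.
const references = new Map<string, string>([["`link`", "https://example.org"]]);
const destination = references.get(rawRefLabel);
```

With the label taken verbatim from the source, the definition written with backticks resolves and the one written without them does not.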
Yeah, having it defined in the specification would be ideal.
I think the main question is, should [`link`][] accept [link]: ... as a definition? Rust docs allow specifying a link to an object with both [`Ident`] and [Ident], but I'm unsure whether they are leveraging Markdown compilation or just manually adding all possible references.
And having both variants be valid could further complicate the spec. It would be somewhat similar to the case-insensitive matching that Markdown has, which was ditched because of its complexity.
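For illustration only, accepting both variants could look like a two-step lookup. `resolveReference` is a hypothetical function, and it assumes the caller has already computed both the raw label and a markup-stripped label.

```typescript
// Hypothetical lookup that accepts either spelling of a definition:
// try the raw label first, then fall back to the markup-stripped one.
function resolveReference(
  definitions: Map<string, string>,
  raw: string,
  stripped: string,
): string | undefined {
  return definitions.get(raw) ?? definitions.get(stripped);
}

const defs = new Map<string, string>([["link", "https://example.org"]]);
const viaStripped = resolveReference(defs, "`link`", "link");
const unresolved = resolveReference(defs, "`other`", "other");
```

In this sketch the raw label wins when both spellings are defined; a spec that allows both variants would have to state some precedence rule like that explicitly.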