commonmark-spec
Character references in link definition labels
- Character references are allowed everywhere, except in fenced code, indented code, or code spans
- They represent their resolved character, not syntax
Example 318 even shows them in link definition destinations and link definition titles.
But the following does not resolve into a link:
[&copy;]: example.com
[©][]
I interpret the spec as saying that it should resolve, but the dingus doesn’t resolve it. This may be a bug in the dingus implementation rather than in the spec.
I agree that, to meet author expectation or intuition, character references of all kinds should be normalized in link labels (and elsewhere), especially since letter case is being ignored. Unfortunately, only a single implementation, Maruku, does it this way, although most CM-conformant parsers (and Pandoc) will happily convert any HTML entities to plain characters on output.
One label matches another just in case their normalized forms are equal. To normalize a label, strip off the opening and closing brackets, perform the Unicode case fold, strip leading and trailing whitespace and collapse consecutive internal whitespace to a single space.
Note that matching is performed on normalized strings, not parsed inline content. So the following does not match, even though the labels define equivalent inline content:
Example 541
[bar][foo\!]

[foo!]: /url
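To make the quoted normalization rule concrete, here is a minimal sketch in Python. The names `normalize_label` and `labels_match` are made up for illustration and are not taken from any CommonMark implementation; `str.casefold()` stands in for the Unicode case fold. Because the comparison works on the raw label text, neither backslash escapes nor character references are resolved first.

```python
def normalize_label(label: str) -> str:
    """Normalize a link label as the quoted spec text describes (illustrative helper)."""
    # Strip the opening and closing brackets, if present.
    if label.startswith("[") and label.endswith("]"):
        label = label[1:-1]
    # Unicode case fold, then strip leading/trailing whitespace and
    # collapse internal runs of whitespace to a single space.
    return " ".join(label.casefold().split())


def labels_match(a: str, b: str) -> bool:
    """Two labels match just in case their normalized forms are equal."""
    return normalize_label(a) == normalize_label(b)


print(labels_match("[Foo   Bar]", "[ foo bar ]"))  # case and whitespace are ignored
print(labels_match("[foo\\!]", "[foo!]"))          # the backslash is still part of the string
print(labels_match("[&copy;]", "[©]"))             # the reference is not resolved
```

Under this reading, the escaped label and the label with a character reference both fail to match their plain counterparts, which is the behaviour the dingus shows.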
The rules for the link text are the same as with inline links.
An inline link […]
character references in the destination will be parsed into the corresponding Unicode code points, as usual.
character references are recognized in any context besides code spans or code blocks, including URLs, link titles, […]
link label […]
The contents of the first link label are parsed as inlines, which are used as the link’s text.
The link text may contain inline content: [Example 526]
This might be related to #572.
Btw, I think this should be true for character escapes too:
[&copy;]: a.com
[\!]: b.com
Both should link: [©], [!]
Yields:
Both should link: [©], [!]
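For comparison, here is a rough sketch of the matching I would expect, where backslash escapes and character references are resolved before the label is normalized. This is my reading, not what the spec or CommonMark.js currently does. The name `resolved_label_key` is hypothetical, and Python's `html.unescape` is only an approximation of CommonMark's entity handling.

```python
import html
import re

# One pass over the label: either a backslash escape of ASCII punctuation,
# or a named/decimal/hexadecimal character reference ending in ";".
_ESCAPE_OR_REFERENCE = re.compile(
    r"\\([!-/:-@\[-`{-~])"
    r"|&(?:#[0-9]{1,7}|#[xX][0-9a-fA-F]{1,6}|[A-Za-z][A-Za-z0-9]*);"
)


def _resolve(match: re.Match) -> str:
    # A backslash escape stands for the escaped character itself.
    if match.group(1) is not None:
        return match.group(1)
    # Otherwise it is a character reference; unknown names are left unchanged.
    return html.unescape(match.group(0))


def resolved_label_key(label: str) -> str:
    """Hypothetical matching key: resolve escapes and references, then normalize."""
    if label.startswith("[") and label.endswith("]"):
        label = label[1:-1]
    resolved = _ESCAPE_OR_REFERENCE.sub(_resolve, label)
    # Same normalization as the spec text: case fold, trim, collapse whitespace.
    return " ".join(resolved.casefold().split())


print(resolved_label_key("[&copy;]") == resolved_label_key("[©]"))
print(resolved_label_key("[\\!]") == resolved_label_key("[!]"))
```

Doing both substitutions in a single pass keeps an escaped ampersand literal, so `\&copy;` stays the text `&copy;` instead of turning into ©.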
@jgm Is this something you agree with? I can create a PR to clarify the docs.
I don't think this needs clarification of the docs so much as a bug report against CommonMark.js.
That said, every single Markdown implementation but one fails this test.