Links between (inline) `html` elements not parsed consistently

Open otherjoel opened this issue 2 years ago • 0 comments

The program

#lang at-exp racket/base

(require commonmark
         racket/format)

@(string->document @~a{
(1st) This <span>is _emphasis_ and [a link](http://example.com).</span>
                       
(2nd) This <span>is _emphasis_ and [a link][1].</span>

(3rd) This <span>is _emphasis_ and [a link](http://example.com).

(4th) This <span>is _emphasis_ and [a link](http://example.com).</div>

[1]: http://example.com})

produces

(document
 (list
  (paragraph (list "(1st) This " (html "<span>") "is " (italic "emphasis") " and [a link](http://example.com)." (html "</span>")))
  (paragraph (list "(2nd) This " (html "<span>") "is " (italic "emphasis") " and " (link "a link" "http://example.com" #f) "." (html "</span>")))
  (paragraph (list "(3rd) This " (html "<span>") "is " (italic "emphasis") " and " (link "a link" "http://example.com" #f) "."))
  (paragraph (list "(4th) This " (html "<span>") "is " (italic "emphasis") " and [a link](http://example.com)." (html "</div>"))))
 '())

Respectively:

  1. Italics are parsed, but the link is not.
  2. Changing the link to use reference notation causes it to be parsed properly.
  3. Keeping the inline URL on the link but removing the closing </span> causes it to be parsed properly.
  4. Like the 1st example, but with mismatched HTML tags (`<span>` closed by `</div>`). The link is still not parsed.

I would naively expect that the link would be parsed correctly in all four examples.

The CommonMark spec sections for inline HTML elements don’t seem to address the parsing of Markdown inside inline-presenting HTML. But for what it’s worth, the reference implementation does parse all four examples in the way I would expect:

[Screenshot: the CommonMark reference dingus showing the link parsed correctly in all four examples]

otherjoel, Sep 23 '23 16:09