comrak
parsing of `**A*B*C*` doesn't match CM-dingus
comrak currently parses `**A*B*C*` as `<p>**A<em>B</em>C*</p>`, whereas the CommonMark dingus gives the result `<p>*<em>A<em>B</em>C</em></p>`. (My implementation agrees with the dingus.)
Here is the same piece of code, to see how GitHub renders it:
**A*B*C*
Looking in the preview and pulling up the developer tools I get:
<p dir="auto">**A<em>B</em>C*</p>
which shows GitHub's behaviour matches comrak.
So I'm not sure what we'd want to do here...
After a careful reading of the spec I think that the dingus is correct.
- First we consider the leading `**`. It can't end an emph, as there are no earlier delimiters.
- Next we consider the `*` in `A*B`. It is both left- and right-flanking, so it can start or end emph. We look backwards from there and find the starting `**`. This can start emph, but we're not allowed to use it since the lengths of `**` and `*` add to 3. There are no earlier entries, so we move on.
- Next we consider the `*` in `B*C`, again both left- and right-flanking. Searching backward we hit the `*` in `A*B`. The sum of the lengths is not 3, so we can use it. This means we now have `**A<em>B</em>C*`. We move on.
- Finally we reach the ending `*`. It is only right-flanking, so it can only end emph. Searching backward we find the initial `**`. The `**` can only start emph and the final `*` can only end emph, so the sum-to-3 issue does not occur and they match, giving `*<em>A<em>B</em>C</em>`.
I guess we might end up with comrak/GFM's behaviour if we considered that final `*` to be able to both start and end emph, in which case the sum-to-3 rule would apply.
However, were that the case we should see the issue with the simpler `**A*`, which comrak, the dingus and my code all parse as `*<em>A</em>`.
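To make the walkthrough above easier to check, here is a minimal Python sketch of the delimiter matching as I read it from the spec. It is not comrak's or cmark's code: the names `Delim`, `tokenize`, `blocked_by_rule_of_three` and `parse_emphasis` are my own. It only handles `*` runs and plain text, uses ASCII punctuation for the flanking test, does no HTML escaping, and states the rule as "sum is a multiple of 3 unless both lengths are multiples of 3", which I believe is the current spec wording.

```python
import re
import string
from dataclasses import dataclass


@dataclass
class Delim:
    count: int       # delimiters in this run that are still unused
    orig: int        # original run length, used by the multiple-of-3 rule
    can_open: bool   # left-flanking
    can_close: bool  # right-flanking


def _is_space(ch):
    # Start and end of input count as whitespace for the flanking test.
    return ch == "" or ch.isspace()


def _is_punct(ch):
    # Simplification: ASCII punctuation only.
    return ch != "" and ch in string.punctuation


def tokenize(text):
    """Split text into plain strings and Delim entries for each run of '*'."""
    nodes = []
    for m in re.finditer(r"\*+|[^*]+", text):
        run = m.group()
        if run[0] != "*":
            nodes.append(run)
            continue
        before = text[m.start() - 1] if m.start() > 0 else ""
        after = text[m.end()] if m.end() < len(text) else ""
        left = not _is_space(after) and (
            not _is_punct(after) or _is_space(before) or _is_punct(before))
        right = not _is_space(before) and (
            not _is_punct(before) or _is_space(after) or _is_punct(after))
        nodes.append(Delim(len(run), len(run), left, right))
    return nodes


def blocked_by_rule_of_three(opener, closer):
    # The rule only applies when one of the two runs can both open and close.
    either_is_both = (opener.can_open and opener.can_close) or (
        closer.can_open and closer.can_close)
    return (either_is_both
            and (opener.orig + closer.orig) % 3 == 0
            and not (opener.orig % 3 == 0 and closer.orig % 3 == 0))


def render(nodes):
    # Unmatched delimiters fall back to literal asterisks.
    return "".join(n if isinstance(n, str) else "*" * n.count for n in nodes)


def parse_emphasis(text):
    nodes = tokenize(text)
    i = 0
    while i < len(nodes):
        closer = nodes[i]
        if isinstance(closer, str) or not closer.can_close:
            i += 1
            continue
        # Search backward for the nearest usable opener.
        j = i - 1
        while j >= 0:
            cand = nodes[j]
            if (isinstance(cand, Delim) and cand.can_open
                    and not blocked_by_rule_of_three(cand, closer)):
                break
            j -= 1
        if j < 0:
            i += 1  # no opener found; move on to the next potential closer
            continue
        opener = nodes[j]
        use = 2 if opener.count >= 2 and closer.count >= 2 else 1
        tag = "strong" if use == 2 else "em"
        wrapped = f"<{tag}>{render(nodes[j + 1:i])}</{tag}>"
        opener.count -= use
        closer.count -= use
        replacement = [wrapped]
        if opener.count > 0:
            replacement.insert(0, opener)   # leftover opener stays in place
        if closer.count > 0:
            replacement.append(closer)      # leftover closer is reprocessed
        nodes[j:i + 1] = replacement
        # Continue just after the wrapped node (a leftover closer sits there).
        i = j + (1 if opener.count > 0 else 0) + 1
    return render(nodes)


print(parse_emphasis("**A*B*C*"))
print(parse_emphasis("**A*"))
```

With this sketch, `**A*B*C*` comes out as `*<em>A<em>B</em>C</em>` and `**A*` as `*<em>A</em>`, in line with the dingus output quoted above. The key point is that `blocked_by_rule_of_three` looks at the flanking of both runs, so the final `*`, which can only close, is free to match the leading `**`.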
Indeed, you are quite right: it looks like the spec always had this implication, but there was never an example that spelled it out. cmark upstream used to do this wrong, and so cmark-gfm (and thus Comrak) followed suit. cmark upstream addressed this bug in https://github.com/commonmark/cmark/commit/dc9366c1a9be4f6c6711556dc175b2583152acd6, and so it would be a similarly simple fix in Comrak.
A fix will be forthcoming — thanks so much!