invalid HTML involving underscore/asterisk and link notation

Open step- opened this issue 1 year ago • 0 comments

Similar to #276, I found a case involving underscores, asterisks, and link notation in which md2html produces invalid HTML because of an unbalanced <em> tag. None of the examples below is intended to trigger emphasis, yet <em> sneaks in regardless.

Note that substituting * for _ in all the examples below produces the same invalid output.


With underscores only:

md2html --fstrikethrough << EOF
a_[](#a%20_) good

a__[](#a%20__) good

a_[](#a%20__) good

a_[](#a%20_/) good, punctuation doesn't matter

_[](_) unbalanced EM

_[](<_>) unbalanced EM

[_][_] balanced but unwanted EMs
EOF

Output (md2html 0.5.2):

<p>a_<a href="#a%20_"></a> good</p>
<p>a__<a href="#a%20__"></a> good</p>
<p>a_<a href="#a%20__"></a> good</p>
<p>a_<a href="#a%20_/"></a> good, punctuation doesn't matter</p>
<p><em><a href="_"></a> unbalanced EM</p>
<p><em><a href="_"></a> unbalanced EM</p>
<p>[<em>][</em>] balanced but unwanted EMs</p>
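
The imbalance in the output above can also be confirmed mechanically. The following is a minimal sketch using Python's standard `html.parser`. It is not part of md4c, and `TagBalanceChecker` and `unbalanced_tags` are hypothetical names introduced only for this illustration. It assumes each output line is checked on its own and only knows about a handful of void elements.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, assumed sufficient here).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class TagBalanceChecker(HTMLParser):
    """Keeps a stack of open tags and records any that are never closed."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Pop back to the matching opener; anything skipped was left unclosed.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.problems.append(f"unclosed <{top}>")


def unbalanced_tags(html_line):
    """Return a list of balance problems found in one line of HTML."""
    checker = TagBalanceChecker()
    checker.feed(html_line)
    checker.close()
    # Tags still open at the end of input were never closed either.
    return checker.problems + [f"unclosed <{t}>" for t in reversed(checker.stack)]


if __name__ == "__main__":
    for line in (
        '<p>a_<a href="#a%20_"></a> good</p>',
        '<p><em><a href="_"></a> unbalanced EM</p>',
        '<p>[<em>][</em>] balanced but unwanted EMs</p>',
    ):
        print(unbalanced_tags(line), line)
```

Run against the lines above, this reports `unclosed <em>` for the two "unbalanced EM" paragraphs and nothing for the others, including the last one, where the tags are balanced but still unwanted.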

step- Dec 01 '24 08:12