commonmark-hs

Absolute value pipes in latex formula seem to trip up list parser

Open maralorn opened this issue 4 years ago • 7 comments

I use commonmark 54ad60d (via the latest neuron). This might be a bug in neuron, in which case please direct me there, but my best guess is that the problem lies in the commonmark parser, so I am starting here.

When my markdown looks like this:

Examples:

* $|x| = 2$
* other example

the parser does not seem to recognize the first line as a list element. The result looks like this:

If I remove the pipe character | it works, as seen here:

Please let me know if you need more information to reproduce this. Also, if there is an easy way to test this example with commonmark-hs without neuron, just tell me and I'll gladly check whether it reproduces there.

cc @srid

maralorn avatar Jul 17 '20 09:07 maralorn

I couldn't reproduce this on https://imalsogreg.github.com/commonmark-editor (screenshot: 2020-07-17-093922_795x805_scrot).

FWIW, I can try reproducing it later with the commonmark CLI.

imalsogreg avatar Jul 17 '20 13:07 imalsogreg

Are you enabling the pipe tables extension?

jgm avatar Jul 18 '20 04:07 jgm

@srid pointed me to this list, and yes, I see the pipe tables extension in there: https://github.com/srid/neuron/blob/a70a532f06d54fdc13510a711a110c3fe41f9e02/neuron/src/lib/Neuron/Reader/Markdown.hs#L107-L125

maralorn avatar Jul 18 '20 09:07 maralorn

The issue can be reproduced with the command line tool thus:

% commonmark -xmath -xpipe_tables
Examples:

* $|x| = 2$
* other example
^D
<p>Examples:</p>
<p>* <span class="math inline">\(|x| = 2\)</span></p>
<ul>
<li>other example
</li>
</ul>

I understand what is happening here. The parser attempts to recognize the first bulleted item as the beginning of a pipe table (because of the | characters). When this fails, it recategorizes the node as a paragraph; at this point it is too late to recognize it as a list.

Something similar happens with:

% commonmark -xmath -xpipe_tables -xfenced_divs
::: {foo="hi|there|ok"}
hi
:::
^D
<p>::: {foo=&quot;hi|there|ok&quot;}
hi
:::</p>

Here we should get a fenced div. It may be that we need to allow some limited backtracking in this case.

jgm avatar Jul 19 '20 17:07 jgm
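
The failure mode described above can be illustrated with a small toy model. This is only a sketch and not commonmark-hs code: `Block`, `Outcome`, `listStart`, `tableStart`, and `classify` are hypothetical names invented for the illustration, and the "tentative table, later recategorized as a paragraph" step is collapsed into a single decision.

```haskell
import Data.List (isPrefixOf)

-- Toy block kinds. Illustrative only; these are not commonmark-hs types.
data Block
  = ListItem String
  | Paragraph String
  deriving (Eq, Show)

-- What a block-start check decides about a line.
data Outcome
  = Declined       -- does not apply; the next check gets a turn
  | Claimed Block  -- claims the line and ends the search
  deriving (Eq, Show)

type BlockStart = String -> Outcome

-- Recognizes a "* " bullet at the start of the line.
listStart :: BlockStart
listStart line
  | "* " `isPrefixOf` line = Claimed (ListItem (drop 2 line))
  | otherwise              = Declined

-- Assumption: the table check claims any line containing '|'. With no
-- delimiter row following, the toy turns it straight into a paragraph,
-- standing in for "recategorized as a paragraph, too late for a list".
tableStart :: BlockStart
tableStart line
  | '|' `elem` line = Claimed (Paragraph line)
  | otherwise       = Declined

-- Try each check in order; the first one that claims the line wins.
classify :: [BlockStart] -> String -> Block
classify starts line = go starts
  where
    go []       = Paragraph line
    go (p : ps) = case p line of
      Claimed b -> b
      Declined  -> go ps
```

In GHCi, `classify [tableStart, listStart] "* $|x| = 2$"` yields a `Paragraph`, while `classify [listStart, tableStart] "* $|x| = 2$"` yields a `ListItem`: in this toy, the result depends only on which check runs first.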

Even simpler example:

% commonmark -xpipe_tables          
# Hi | there | ok
^D
<p># Hi | there | ok</p>

jgm avatar Jul 19 '20 17:07 jgm

Another option would be to do some kind of lookahead for the separator lines. The current architecture, though, is built for line-by-line parsing.

jgm avatar Jul 19 '20 17:07 jgm
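
A minimal sketch of what such a lookahead could check, assuming a simplified delimiter-row grammar (cells made of dashes with optional colons). None of this is taken from commonmark-hs: `startsTableAt` and its helpers are hypothetical, and the toy ignores escaped pipes and pipes inside code spans or math.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Split a line on '|' without any escaping rules (toy assumption).
splitCells :: String -> [String]
splitCells s = case break (== '|') s of
  (cell, [])       -> [cell]
  (cell, _ : rest) -> cell : splitCells rest

trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Trimmed cells, without the empty cells produced by outer pipes.
cells :: String -> [String]
cells = dropWhileEnd null . dropWhile null . map trim . splitCells

stripLeadingColon :: String -> String
stripLeadingColon (':' : xs) = xs
stripLeadingColon xs         = xs

-- A delimiter cell: one or more dashes, with an optional colon on each side.
isDelimiterCell :: String -> Bool
isDelimiterCell cell = not (null dashes) && all (== '-') dashes
  where
    dashes = reverse (stripLeadingColon (reverse (stripLeadingColon cell)))

isDelimiterRow :: String -> Bool
isDelimiterRow line = not (null cs) && all isDelimiterCell cs
  where
    cs = cells line

-- Lookahead: treat the first line as a table header only if the line
-- after it is a delimiter row with the same number of cells.
startsTableAt :: [String] -> Bool
startsTableAt (header : next : _) =
  '|' `elem` header
    && isDelimiterRow next
    && length (cells header) == length (cells next)
startsTableAt _ = False
```

With a check like this, `startsTableAt ["* $|x| = 2$", "* other example"]` is `False`, so the bullet line would never be claimed as a table, whereas `startsTableAt ["a | b", "--- | ---"]` is `True`. The cost is that the decision needs the following line, which is what conflicts with strictly line-by-line parsing.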

A simple workaround for now would be to make sure that the pipe_tables syntax comes AFTER the list syntax:

defaultSyntaxSpec <> pipeTableSpec

jgm avatar Jul 19 '20 18:07 jgm
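
For readers using the library rather than the CLI, the ordering above might be wired up roughly as follows. This is a sketch that assumes the `commonmark` and `commonmark-extensions` packages. Including `mathSpec` is an addition here to mirror the `-xmath` flag from the reproduction, and no particular output is claimed.

```haskell
module Main (main) where

import Commonmark
import Commonmark.Extensions (mathSpec, pipeTableSpec)
import Data.Functor.Identity (Identity, runIdentity)
import qualified Data.Text.IO as TIO
import qualified Data.Text.Lazy.IO as TLIO

-- The default syntax (which includes lists) comes first and
-- pipeTableSpec comes last, following the suggested workaround.
-- mathSpec is an assumption added to match the -xmath repro.
spec :: SyntaxSpec Identity (Html ()) (Html ())
spec = defaultSyntaxSpec <> mathSpec <> pipeTableSpec

-- Read Markdown from stdin and write the rendered HTML to stdout.
main :: IO ()
main = do
  input <- TIO.getContents
  case runIdentity (commonmarkWith spec "stdin" input) of
    Left err   -> error (show err)
    Right html -> TLIO.putStr (renderHtml html)
```

Feeding it the `Examples:` input from the reproduction should show whether the first bullet is recognized as a list item with this ordering.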