
Org-mode reader: `#+pandoc-emphasis-pre` doesn't work as expected

Open adql opened this issue 3 years ago • 4 comments

Explain the problem. Adding characters to `#+pandoc-emphasis-pre` as described in the manual has no effect: emphasis that follows one of the added characters is still not recognized. Interestingly, adding characters to `#+pandoc-emphasis-post` does work.

Minimal working example test.org:

```
#+pandoc-emphasis-pre: "-\t ('\"{T"
#+pandoc-emphasis-post: "-\t\n .,:!?;'\")}[t"

1. T/est/ with T allowed as pre
2. /Tes/t with t allowed as post
3. Normal /emphasis/, and in {/brackets/}
```

Command: `pandoc -o test.md test.org`

Expected test.md result:

```
1.  T*est* with T allowed as pre
2.  *Tes*t with t allowed as post
3.  Normal *emphasis*, and in {*brackets*}
```

Actual result:

```
1.  T/est/ with T allowed as pre
2.  *Tes*t with t allowed as post
3.  Normal *emphasis*, and in {*brackets*}
```

Exporting to Pandoc AST confirms that the problem is with the reader.
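A rough sketch of that check: write the example to `test.org` and ask pandoc for its native AST instead of Markdown. `-f org` and `-t native` are standard pandoc options; the `command -v` guard is an addition so the snippet does not fail where pandoc is not installed.

```shell
# Recreate the minimal example (the quoted heredoc keeps backslashes and quotes literal).
cat > test.org <<'EOF'
#+pandoc-emphasis-pre: "-\t ('\"{T"
#+pandoc-emphasis-post: "-\t\n .,:!?;'\")}[t"

1. T/est/ with T allowed as pre
2. /Tes/t with t allowed as post
3. Normal /emphasis/, and in {/brackets/}
EOF

# Print the AST produced by the Org reader, bypassing the Markdown writer.
if command -v pandoc >/dev/null 2>&1; then
  pandoc -f org -t native test.org
else
  echo "pandoc not found on PATH" >&2
fi
```

With the behavior reported here, the first list item should show `T/est/` as plain text rather than inside an `Emph` node, while items 2 and 3 do contain `Emph`.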

Replacing the two strings entirely with `"T"` and `"t"`, respectively, gives the same result.

Pandoc version? Pandoc 2.18 on Manjaro Linux (pandoc-2.18-linux-amd64.tar.gz from the release page). Also happens on https://pandoc.org/try

adql, May 07 '22 16:05

The relevant test in `tests/Tests/Readers/Org/Meta.hs` is:

```haskell
[ "Changing pre and post chars for emphasis" =:
    T.unlines [ "#+pandoc-emphasis-pre: \"[)\""
              , "#+pandoc-emphasis-post: \"]\\n\""
              , "([/emph/])*foo*"
              ] =?>
    para ("([" <> emph "emph" <> "])" <> strong "foo")
```

This test, which adds the non-standard `[` to the pre characters, passes. The bug may be specific to alphanumeric characters (?): when I modified the test to use `T`, `t`, or `3`, it failed each time.

adql, Jun 18 '22 19:06

I tried more cases, and the problem seems to be more general than I first thought. The only characters I have managed to use as pre so far are various parentheses, `$`, and `+`. Alphanumeric characters, `!`, `%`, and `#` all fail, and probably others do as well.
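A sweep like this can be scripted. The following is only a sketch: `make_case` is a hypothetical helper, not part of pandoc, and the input shape (`x<char>/emph/ text`) is an assumption chosen to keep the candidate character away from the start of the line. The loop reports whether an `Emph` node appears in the native output for each candidate.

```shell
# Hypothetical helper: emit an Org document that sets the pre characters to "$1"
# and places that character directly before an emphasized word.
make_case() {
  printf '#+pandoc-emphasis-pre: "%s"\n\nx%s/emph/ text\n' "$1" "$1"
}

# Run each candidate through the Org reader and look for an Emph node in the AST.
if command -v pandoc >/dev/null 2>&1; then
  for c in '[' '(' '$' '+' 'T' '3' '!' '%' '#'; do
    if make_case "$c" | pandoc -f org -t native | grep -q 'Emph'; then
      echo "$c: emphasis recognized"
    else
      echo "$c: emphasis NOT recognized"
    fi
  done
else
  echo "pandoc not found on PATH" >&2
fi
```

Characters that need escaping inside the quoted setting, such as `"` or `\`, are left out of the list to keep the helper simple.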

adql, Jun 20 '22 10:06

Related issue: #6070

tarleb, Jun 20 '22 10:06

Tracing `orgStateEmphasisPreChars` in the parser state shows that it is updated as expected after `#+pandoc-emphasis-pre:`, including with characters that still fail to be accepted before emphasis. The problem must therefore be in the parsing of the emphasis itself.

adql, Jun 23 '22 11:06