Org-mode reader: `#+pandoc-emphasis-pre` doesn't work as expected
Explain the problem.
Adding characters to `#+pandoc-emphasis-pre` as described in the manual doesn't work as expected: for many characters the setting has no visible effect. Interestingly, adding characters to `#+pandoc-emphasis-post` does work.
Minimal working example, `test.org`:
#+pandoc-emphasis-pre: "-\t ('\"{T"
#+pandoc-emphasis-post: "-\t\n .,:!?;'\")}[t"
1. T/est/ with T allowed as pre
2. /Tes/t with t allowed as post
3. Normal /emphasis/, and in {/brackets/}
Command: `pandoc -o test.md test.org`
Expected test.md result:
1. T*est* with T allowed as pre
2. *Tes*t with t allowed as post
3. Normal *emphasis*, and in {*brackets*}
Actual result:
1. T/est/ with T allowed as pre
2. *Tes*t with t allowed as post
3. Normal *emphasis*, and in {*brackets*}
Exporting to Pandoc AST confirms that the problem is with the reader.
Replacing the two strings entirely with "T" and "t", respectively, gives the same result.
Pandoc version?
Pandoc 2.18 on Manjaro Linux (pandoc-2.18-linux-amd64.tar.gz from the release page). Also happens on https://pandoc.org/try
The relevant test in tests/Tests/Readers/Org/Meta.hs
    [ "Changing pre and post chars for emphasis" =:
        T.unlines [ "#+pandoc-emphasis-pre: \"[)\""
                  , "#+pandoc-emphasis-post: \"]\\n\""
                  , "([/emph/])*foo*"
                  ] =?>
        para ("([" <> emph "emph" <> "])" <> strong "foo")
, which tests adding the non-standard `[` to pre, passes. Possibly the bug is limited to alphanumeric characters: I modified the test to use T, t, and 3, and all three fail.
I tried more cases, and the problem seems to be more general than I first thought. The only characters I have managed to get working as pre so far are various parentheses, `$`, and `+`. Alphanumeric characters, `!`, `%`, and `#` all fail, and probably others as well.
Related issue: #6070
Tracing `orgStateEmphasisPreChars` in the parser state shows that it is updated as expected after `#+pandoc-emphasis-pre:`, even for characters that then fail to be accepted before emphasis. The problem must therefore be in the parsing of the emphasis itself.
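To make the expectation concrete, here is how I read the rule in the manual, written as a small standalone Haskell model. This is only a sketch and not pandoc's code: `mayOpenEmphasis`, `charBeforeSlash`, and `examplePreChars` are names invented for illustration, and the model assumes that emphasis may open at the start of the input or directly after one of the configured pre characters.

```haskell
module Main where

-- Hypothetical model of the rule as described in the manual.
-- This is NOT pandoc's implementation; every name here is made up.

-- | Pre characters from the example file: "-\t ('\"{T"
examplePreChars :: [Char]
examplePreChars = "-\t ('\"{T"

-- | Assumed rule: emphasis may open when there is no previous character
-- or when the previous character is one of the allowed pre characters.
mayOpenEmphasis :: [Char] -> Maybe Char -> Bool
mayOpenEmphasis _   Nothing  = True
mayOpenEmphasis pre (Just c) = c `elem` pre

-- | Character immediately before the first '/' in a string.
-- Returns Nothing when the slash comes first or there is no slash.
charBeforeSlash :: String -> Maybe Char
charBeforeSlash s = case break (== '/') s of
  (before, '/' : _) | not (null before) -> Just (last before)
  _                                     -> Nothing

main :: IO ()
main = mapM_ report ["T/est/", "{/brackets/}", "x/est/"]
  where
    report s =
      putStrLn (s ++ ": " ++ show (mayOpenEmphasis examplePreChars (charBeforeSlash s)))
```

Under this model, `T/est/` and `{/brackets/}` should both open emphasis and `x/est/` should not. The actual reader accepts `{` but rejects `T`, even though both are in the configured pre string.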