MyST-Parser
Fix smartquote logic
I'm surprised no one has noticed this before, but I've just realised there are some "conflicts" in how smartquotes are handled between myst (via markdown-it), docutils and sphinx.
Within myst-parser, enabling the "smartquotes" extension (e.g. via myst_enable_extensions = ["smartquotes"]) enables the markdown-it smartquotes rule: https://github.com/executablebooks/MyST-Parser/blob/2b3a93129623dc6d2ae1254edd6d35b0b1917726/myst_parser/main.py#L252-L254
You can see its tested behaviour here: https://raw.githubusercontent.com/executablebooks/markdown-it-py/master/tests/test_port/fixtures/smartquotes.md, i.e. this is applied before converting to the docutils AST.
With docutils, the MyST parser is inheriting the docutils RST parser: https://github.com/executablebooks/MyST-Parser/blob/2b3a93129623dc6d2ae1254edd6d35b0b1917726/myst_parser/docutils_.py#L166
Because of this, it also currently inherits its transforms, to which the RST parser adds the SmartQuotes transform: https://github.com/chrisjsewell/docutils/blob/8adab0660b2097b4f3c32cef7e5ff4cb3c72b084/docutils/docutils/parsers/rst/__init__.py#L177-L179
This does approximately the same thing as the markdown-it rule, but is turned off by default, so for example:
$ echo "\"a\"" | rst2pseudoxml.py
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | rst2pseudoxml.py --smart-quotes=yes
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | myst-docutils-pseudoxml --smart-quotes=yes
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml --myst-enable-extensions=smartquotes
<document source="<stdin>">
    <paragraph>
        “a”
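To summarise the five runs above: under docutils, the output ends up with curly quotes if either switch is on. The following is only a toy model of that observed behaviour, not code from myst-parser or docutils; the function name and its parameters are hypothetical.

```python
def quotes_are_curly(docutils_smart_quotes: bool, myst_smartquotes_ext: bool) -> bool:
    """Toy model: either mechanism alone is enough to convert the quotes."""
    # The markdown-it rule runs while parsing; the docutils transform runs
    # afterwards on the doctree. Each can convert the quotes on its own.
    return docutils_smart_quotes or myst_smartquotes_ext


# (docutils --smart-quotes, myst smartquotes extension)
for flags in [(False, False), (True, False), (False, True)]:
    print(flags, quotes_are_curly(*flags))
```

The point of the model is that there are currently two independent switches for what is, from the user's side, one feature.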
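We could perhaps remove the SmartQuotes transform from the MyST parser, i.e.:
from docutils.parsers.rst import Parser as RstParser
from docutils.transforms.universal import SmartQuotes

class Parser(RstParser):
    def get_transforms(self):
        # Drop the SmartQuotes transform inherited from the RST parser
        transforms = super().get_transforms()
        transforms.remove(SmartQuotes)
        return transforms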
For sphinx though, removing the SmartQuotes transform would not matter, because it already does this: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/parsers.py#L72-L80, in order to replace it with SphinxSmartQuotes: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/transforms/__init__.py#L323
By contrast with docutils, sphinx smartquotes are on by default 😬: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smartquotes,
so even if you don't enable the myst-parser smartquotes extension, quotes will still be converted, unless you set smartquotes = False in conf.py.
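As a user-side illustration, a conf.py fragment along the following lines should leave the markdown-it rule as the only thing converting quotes in a Sphinx build. It uses only the two configuration values mentioned above; that this particular combination is the desired one is an assumption, not a recommendation from this issue.

```python
# conf.py (fragment) -- a sketch, assuming the markdown-it rule should be
# the only mechanism that converts quotes in MyST documents.
extensions = ["myst_parser"]

# Turns on the markdown-it smartquotes rule inside myst-parser.
myst_enable_extensions = ["smartquotes"]

# Turns off Sphinx's own smart quotes handling, which is on by default.
smartquotes = False
```

Note that smartquotes = False applies to the whole project, so any RST sources in the same build would lose smart quotes as well.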
There are ways of disabling this, for instance by setting document.settings.smart_quotes = False in the parser.
I haven't decided what the best behaviour should be yet, though I imagine there should be only one "source of truth" for enabling/disabling smart quotes:
- Disable the docutils/sphinx smartquote transforms, and rely only on the markdown-it rule
- Remove the use of the markdown-it rule, and have myst_enable_extensions = ["smartquotes"] dynamically change the docutils/sphinx config in the parser
- Remove the myst-parser smartquotes extension entirely and instruct users to use the docutils/sphinx config options
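A minimal sketch of what the second option could look like, assuming the parser has access to the document settings and to the list of enabled MyST extensions. The helper name sync_smart_quotes is hypothetical, and a SimpleNamespace stands in for document.settings so the snippet runs without docutils installed; only the smart_quotes attribute name comes from the discussion above.

```python
from types import SimpleNamespace


def sync_smart_quotes(settings, enabled_extensions):
    """Hypothetical helper: let the MyST extension list be the only switch.

    Overwrites ``settings.smart_quotes`` so that the docutils/sphinx
    transform is active only when "smartquotes" is an enabled extension.
    """
    settings.smart_quotes = "smartquotes" in enabled_extensions
    return settings


# Stand-in for document.settings, with the transform initially enabled.
settings = SimpleNamespace(smart_quotes=True)

sync_smart_quotes(settings, enabled_extensions=[])
print(settings.smart_quotes)

sync_smart_quotes(settings, enabled_extensions=["smartquotes"])
print(settings.smart_quotes)
```

With this approach a user's own smartquotes setting would be silently overridden for MyST documents, which is the kind of trade-off the list above is weighing.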
Perhaps the SmartQuotes transform is more "optimised" for docutils/sphinx use, but the advantage of the markdown-it rule is that it is not tightly coupled to docutils, so you can use it in other "myst implementations", e.g.: https://github.com/executablebooks/myst-vs-code/blob/c8c96542171ebe6ccdd896cf8337a5ffffeb73d4/src/extension.ts#L1
also related #424
cc @choldgraf, @mmcky, etc and anyone else's thoughts welcome
Thanks @chrisjsewell. If I understand correctly, smartquotes provides typographically correct quotation marks, right? Given markdown-it is a tight dependency and does the bulk of the parsing from myst:md to sphinx.ast, I don't see a big downside to using it for smartquotes. Perhaps the key issue is that if a user provides sphinx configuration in _config.py, then this may cause confusing results as it is rendered ineffective?
I see your issue re: source of truth here.