MyST-Parser icon indicating copy to clipboard operation
MyST-Parser copied to clipboard

Fix smartquote logic

Open chrisjsewell opened this issue 3 years ago • 2 comments

I'm surprised no one has noticed this before, but I've just realised there are some "conflicts" in how smartquotes are handled; between myst (via markdown-it), docutils and sphinx.


Within myst-parser, enabling the "smartquotes" extension (e.g. via myst_enable_extensions = ["smartquotes"], enables the markdown-it smartquotes rule: https://github.com/executablebooks/MyST-Parser/blob/2b3a93129623dc6d2ae1254edd6d35b0b1917726/myst_parser/main.py#L252-L254 You can see its tested behaviour here: https://raw.githubusercontent.com/executablebooks/markdown-it-py/master/tests/test_port/fixtures/smartquotes.md, i.e. this is applied before converting to docutils AST


With docutils, the MyST parser is inheriting the docutils RST parser: https://github.com/executablebooks/MyST-Parser/blob/2b3a93129623dc6d2ae1254edd6d35b0b1917726/myst_parser/docutils_.py#L166 Because of this, it also currently inherits its transforms, for which RST adds the SmartQuote transform: https://github.com/chrisjsewell/docutils/blob/8adab0660b2097b4f3c32cef7e5ff4cb3c72b084/docutils/docutils/parsers/rst/init.py#L177-L179

This does approximately the same thing, and is turned off by default, so for example:

$ echo "\"a\"" | rst2pseudoxml.py
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | rst2pseudoxml.py --smart-quotes=yes
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml 
<document source="<stdin>">
    <paragraph>
        "a"
$ echo "\"a\"" | myst-docutils-pseudoxml --smart-quotes=yes 
<document source="<stdin>">
    <paragraph>
        “a”
$ echo "\"a\"" | myst-docutils-pseudoxml --myst-enable-extensions=smartquotes
<document source="<stdin>">
    <paragraph>
        “a”

we could perhaps remove this transform, i.e.:

class Parser(RstParser):

    def get_transforms(self):
        transforms = super().get_transforms()
        transforms.remove(SmartQuotes)
        return transforms

For sphinx though, removing the SmartQuote transform would not matter, because it already does this: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/parsers.py#L72-L80, in order to replace it with SphinxSmartQuotes: https://github.com/sphinx-doc/sphinx/blob/eed0730b4ba3bd2fbd34f2d6ab555ba876c77717/sphinx/transforms/init.py#L323

By contrast with docutils, sphinx smartquotes are on by default 😬 : https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-smartquotes, so even if you don't enable the myst-parser smartquotes extension, quotes will still be converted, unless you set smartquotes = False

There are ways of disabling, for instance by setting document.settings.smart_quotes = False in the parser.


Not decided on what the best behaviour here should be yet, I imagine though there should be only one "source of truth" for enabling/disabling smart quotes:

  1. Disable the docutils/sphinx smartquote transforms, and rely only on the markdown-it rule
  2. Remove the use of the markdown-it rule, and have myst_enable_extensions = ["smartquotes"] dynamically change the docutils/sphinx config in the parser
  3. Remove the myst-parser smartquotes extension entirely and instruct users to use the docutils/sphinx config options

Perhaps the SmartQuote transform is more "optimised" for docutils/sphinx use, but the pro of the markdown-it rule, is that it is not tightly coupled to docutils, so you can use it in other "myst implementations", e.g.: https://github.com/executablebooks/myst-vs-code/blob/c8c96542171ebe6ccdd896cf8337a5ffffeb73d4/src/extension.ts#L1

also related #424

cc @choldgraf, @mmcky, etc and anyone else's thoughts welcome

chrisjsewell avatar Dec 29 '21 17:12 chrisjsewell

thanks @chrisjsewell -- If I understand correctly smartquotes provides typographically correct quotations marks right? Given markdown-it is a tight dependency and does the bulk of the parsing from myst:md to sphinx.ast I don't see a big downside to using it for smartquotes. Perhaps the key issues is if a user provides sphinx configuration in _config.py then this may cause confusing results as it is rendered ineffective?

I see your issue re: source of truth here.

mmcky avatar Jan 03 '22 23:01 mmcky