MyST-Parser icon indicating copy to clipboard operation
MyST-Parser copied to clipboard

Issues with sphinx `Locale` transform (assumes rST)

Open jpmckinney opened this issue 3 years ago • 15 comments

Screen Shot 2021-02-05 at 5 11 55 PM

Documentation repository: https://github.com/open-contracting/standard/pull/1197

To reproduce:

pip install -r requirements.txt
make

jpmckinney avatar Feb 05 '21 22:02 jpmckinney

Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:

welcome[bot] avatar Feb 05 '21 22:02 welcome[bot]

cc @choldgraf

jpmckinney avatar Feb 05 '21 22:02 jpmckinney

can you provide a minimal working of example of a file that causes this. (it may be related to https://github.com/executablebooks/MyST-Parser/issues/262)

chrisjsewell avatar Feb 05 '21 22:02 chrisjsewell

also does this only happen for a translated document?

chrisjsewell avatar Feb 05 '21 22:02 chrisjsewell

Yes, it only happens for a translated document. I'll try to work up an example. As you can see in the screenshot, the warning text is where the Spanish-language header would be otherwise.

jpmckinney avatar Feb 05 '21 23:02 jpmckinney

thanks, it looks to me that there is something weird going on with these translated texts, where the heading levels are messed up:

For example I don't see why it would think that: https://raw.githubusercontent.com/choldgraf/standard/myst-parser-switch/docs/404.md, has a level 2 (i.e. ##) level heading.

 /home/runner/work/standard/standard/docs/404.md:5:<translated>:1: WARNING: Non-consecutive header level increase; 0 to 2

Is there a way to view the source text of these special <translated> files?

chrisjsewell avatar Feb 05 '21 23:02 chrisjsewell

Here's a simple example: https://github.com/jpmckinney/myst-parser-tests

Instructions to reproduce are at: https://github.com/jpmckinney/myst-parser-tests/blob/master/index.md

Thanks for the quick responses!

jpmckinney avatar Feb 05 '21 23:02 jpmckinney

Note that, using the code in #303, the heading does appear in Spanish.

jpmckinney avatar Feb 05 '21 23:02 jpmckinney

Note that, using the code in #303, the heading does appear in Spanish.

indeed, but that is not addressing the root issue of the heading levels

I'll have to leave @choldgraf to look into this for now

(perhaps the translation is converting # to ## for some reason 🤷)

chrisjsewell avatar Feb 05 '21 23:02 chrisjsewell

So this is actually an upstream bug with the Locale transform: https://github.com/sphinx-doc/sphinx/blob/163c7bbdc12411818de1ce0204ff29cd911bcaf2/sphinx/transforms/i18n.py#L99, which in a number of places assumes that the parser is the reStructuredText one.

In particular here, for title nodes, it adds an - underline (https://github.com/sphinx-doc/sphinx/blob/163c7bbdc12411818de1ce0204ff29cd911bcaf2/sphinx/transforms/i18n.py#L266), which in Markdown just so happens to be a level two Setext header.

This leads to parsing e.g.:

Un título
------------------

to

<document source="/Users/chrisjsewell/Documents/GitHub/myst-parser-intl-tests/index.md:6:<translated>">
    <system_message level="2" line="1" source="/Users/chrisjsewell/Documents/GitHub/myst-parser-intl-tests/index.md:6:<translated>" type="WARNING">
        <paragraph>
            Non-consecutive header level increase; 0 to 2
    <section ids="un-titulo" names="un\ título">
        <title>
            Un título

then the first node is assumed to be the title => "Non-consecutive header level increase; 0 to 2" becoming the title.

translations in literal blocks also look to be an issue, because it pre-converts them from

a literal block

to

::

   a literal block

So yeh this requires a "fix" in the sphinx Locale code and/or overriding the transform in myst-parser (the first being ideal)

chrisjsewell avatar Feb 06 '21 01:02 chrisjsewell

Hmm, for my organization's project, we might need to do both, since even if there's a fix for Sphinx, it will only be available in the latest release (whenever that occurs).

@choldgraf Will you be having a look? This would be a blocker for using MyST. If we need to temporarily fork Sphinx or MyST-Parser, that's fine.

Update: We can use my branch here (which is a PR for Sphinx): https://github.com/jpmckinney/sphinx/tree/markdown-locale-heading-3.x

jpmckinney avatar Feb 08 '21 17:02 jpmckinney

I created https://github.com/sphinx-doc/sphinx/issues/8852, but I assume the issue will only be fixed in Sphinx>=4, which myst-parser doesn't presently support. Update: Nevermind, they're accepting non-breaking PRs for 3.x.

jpmckinney avatar Feb 08 '21 18:02 jpmckinney

Sphinx merged by PR in its 3.x branch to fix the heading level jump: https://github.com/sphinx-doc/sphinx/pull/8853

The maintainer suggests an approach in https://github.com/sphinx-doc/sphinx/issues/8852.

I think we can close the issue here, since it's a problem with upstream.

jpmckinney avatar Feb 13 '21 04:02 jpmckinney

ah sounds great - thanks for the update @jpmckinney

choldgraf avatar Feb 13 '21 20:02 choldgraf

I think we can close the issue here, since it's a problem with upstream.

I wouldn't close This though the nail the upstream issue has actually been closed: https://github.com/sphinx-doc/sphinx/issues/8852#issuecomment-778561279

chrisjsewell avatar Feb 14 '21 10:02 chrisjsewell