
Don't be so anal about heading levels

Open infinity0 opened this issue 1 year ago • 5 comments

Describe the bug

$ cat lol.md
## test
$ pandoc lol.md
<h2 id="test">test</h2>

OTOH, MyST decides to throw a huge tantrum and generates the following:

<div class="document" id="test">
<h1 class="title">test</h1>

<div class="system-message">
<p class="system-message-title">System Message: WARNING/2 (<tt class="docutils">lol.md</tt>, line 1)</p>
Document headings start at H2, not H1 [myst.header]</div>
</div>

This breaks the use-case of using MyST for document fragments to be included into other documents or templating engines.

This works fine with regular reStructuredText in docutils. Yes, rst syntax is different, and heading levels are determined implicitly by the structure of the document, but for exactly this reason docutils provides a config option, initial_header_level, that lets the user override the initial header level. Another option, doctitle_xform, controls whether the first heading is treated as the document "title".

Markdown has explicit heading levels, and therefore does not need these two options. So, MyST can and should generate <h2> etc directly, just like pandoc-Markdown does.

Reproduce the bug

As described above, with the pandoc example.

List your environment

$ myst-docutils-html --version
myst-docutils-html (Docutils 0.19.1b.dev, Python 3.10.7, on linux)

infinity0, Oct 31 '22 14:10

Note, merely suppressing the warning doesn't work:

  1. It only works via Sphinx, not when using docutils directly.
  2. The output is still <h1>, not <h2>.

infinity0, Oct 31 '22 14:10

Dude, the title of this issue is not acceptable. But setting that aside, with the develop version of myst-parser:

myst-docutils-html lol.md --myst-suppress-warnings="myst.header" --no-doc-title --initial-header-level=2
...
<body>
<div class="document">
<div class="section" id="test">
<h2>test</h2>
</div>
</div>
</body>
</html>

chrisjsewell, Jan 12 '23 01:01

> Markdown has explicit heading levels, and therefore does not need these two options.

It does, though. This is the AST that is generated:

$ myst-docutils-pseudoxml lol.md --myst-suppress-warnings="myst.header" --no-doc-title 
<document source="test.md">
    <section ids="head" names="head">
        <title>
            head

There is no information captured about what the "original" heading level was.

You could add this here, e.g. <section level="2" ids="head" names="head">, but then you would also need to add code in docutils/Sphinx to actually do something with it.

chrisjsewell, Jan 12 '23 01:01

Basically, you would need to change https://github.com/live-clones/docutils/blob/18fa34d24994d629e3887152fe17b281e20ee907/docutils/docutils/writers/_html_base.py#L1610 to set the h<i> based on node["level"].

If this is something you want, then you need to open an issue/PR in docutils for it.
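
For illustration only, here is a minimal sketch of what "set the h<i> based on node["level"]" could look like. It is not docutils code: heading_tag and its parameters are hypothetical names, the "level" attribute does not exist in docutils today, and the fallback formula (section depth plus initial_header_level minus one) is my understanding of the current writer's behaviour and should be checked against the linked source. Clamping to h1..h6 is an assumption of the sketch.

```python
def heading_tag(section_depth, initial_header_level=1, explicit_level=None):
    """Pick the HTML heading tag for a section title (hypothetical helper).

    section_depth: nesting depth of the section in the doctree (1 = top level).
    initial_header_level: value of the writer setting of the same name.
    explicit_level: stand-in for a node["level"] attribute recorded by the
        parser; None means the parser recorded nothing.
    """
    if explicit_level is not None:
        # Proposed behaviour: trust the level written in the Markdown source.
        level = explicit_level
    else:
        # Assumed current behaviour: derive the level from nesting depth.
        level = section_depth + initial_header_level - 1
    # HTML only defines h1..h6, so clamp (an assumption of this sketch).
    level = max(1, min(level, 6))
    return f"h{level}"


if __name__ == "__main__":
    # A lone "## test" is a top-level section (depth 1) in the doctree.
    print(heading_tag(1))                          # no level info recorded
    print(heading_tag(1, initial_header_level=2))  # user passes the option
    print(heading_tag(1, explicit_level=2))        # parser recorded the level
```

The design question this raises is precedence: in the sketch an explicit level wins over initial_header_level, which is only one of several possible choices.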

chrisjsewell, Jan 12 '23 01:01

> But setting aside that, with the develop version of myst-parser:
>
> myst-docutils-html lol.md --myst-suppress-warnings="myst.header" --no-doc-title --initial-header-level=2
> ...
> <body>
> <div class="document">
> <div class="section" id="test">
> <h2>test</h2>
> </div>
> </div>
> </body>
> </html>

I am saying that MyST should set --initial-header-level and --no-doc-title automatically when it detects that the first heading is ## not #, since the Markdown format specification uses neither of these concepts.
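
To make the proposal concrete, here is a rough sketch of the detection step, assuming it would run on the raw Markdown before parsing. It is not MyST code: first_heading_level and suggested_overrides are hypothetical helpers, and the scan is deliberately simplified (ATX headings only, fenced code blocks skipped, setext headings and indented code ignored). The dictionary keys are the docutils setting names behind --no-doc-title and --initial-header-level.

```python
import re

# ATX heading: up to 3 spaces of indent, 1-6 '#', then whitespace or end of line.
_ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)")
# Opening or closing code fence: at least three backticks or tildes.
_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def first_heading_level(markdown_text):
    """Return the level (1-6) of the first ATX heading, or None if none is found."""
    fence = None  # marker that opened the current fenced block, if any
    for line in markdown_text.splitlines():
        m = _FENCE.match(line)
        if m:
            marker = m.group(1)
            if fence is None:
                fence = marker  # entering a fenced code block
            elif marker[0] == fence[0] and len(marker) >= len(fence):
                fence = None  # leaving the fenced code block
            continue
        if fence is not None:
            continue  # '#' inside code is not a heading
        h = _ATX_HEADING.match(line)
        if h:
            return len(h.group(1))
    return None


def suggested_overrides(markdown_text):
    """Suggest docutils settings when the first heading is not a level-1 heading."""
    level = first_heading_level(markdown_text)
    if level is None or level == 1:
        return {}  # nothing to adjust
    return {"doctitle_xform": False, "initial_header_level": level}


if __name__ == "__main__":
    print(suggested_overrides("## test\n"))
```

Whether such overrides should take precedence over values the user sets explicitly is left open here.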

infinity0, Jan 18 '23 11:01