
Myst handles HTML anchors in headings in an unexpected way

Open arwedus opened this issue 2 years ago • 8 comments

Describe the bug

Context: We have documents whose structure is very similar to the C++ Core Guidelines, and they use HTML anchors in headings the same way:

## <a name="S-const"></a>Con: Constants and immutability

The documents are from an external source and we cannot change them.

expectation

Links work in the GitHub Markdown viewer:

* [Con: Constants and immutability](#S-const)
* the above produces a valid href to the section heading

GitHub also generates a "slug" for each heading anchor and ignores the HTML tags when doing so:

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#con-constants-and-immutability

This is what I expect myst-parser to do as well.

Btw., both links work in full URLs, also the custom one: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#S-const

bug

Instead, myst parser includes the HTML code in the heading anchors slug:

cpp_coding_guidelines.html#a-name-con-a-constants-and-immutability

The manually added anchor slugs also do not work and these links produce nasty warnings (which is related to #564, I assume):

cpp_coding_guidelines.md:2368: warning: 'myst' reference target not found: #Rf-single

problem

This is a problem because it breaks all links and spams lots of warnings during sphinx-build.

Reproduce the bug

  1. Download the cpp_core_guidelines.md and create a little sphinx project around it
  2. generate the HTML docs

List your environment

Sphinx==5.0.2
myst-parser==0.18.0
sphinx-rtd-theme==1.0.0
sphinx_design==0.2.0
mdit-py-plugins==0.3.0

arwedus avatar Jul 14 '22 10:07 arwedus

Hey @arwedus, have you seen https://myst-parser.readthedocs.io/en/latest/syntax/optional.html?highlight=anchor#auto-generated-header-anchors? This specifically creates GitHub-style anchors.

chrisjsewell avatar Aug 23 '22 09:08 chrisjsewell

@chrisjsewell I forgot to mention that I use this option ( myst_heading_anchors = 3 ) in the case described above.

arwedus avatar Aug 24 '22 20:08 arwedus

Instead, myst parser includes the HTML code in the heading anchors slug

Heya, yes the inclusion of the HTML code comes from: https://github.com/executablebooks/mdit-py-plugins/blob/855067ea167d90d5f66075ad124c206d9a1bf959/mdit_py_plugins/anchors/index.py#L82

Which, in turn, comes from: https://www.npmjs.com/package/markdown-it-anchor, "By default we include only text and code_inline tokens, which appeared to be a sensible approach for the vast majority of use cases."

But as you highlight, GitHub does not appear to include these. So I guess we would want to "fix" this upstream. To do this, it would be great if we could get the "official" code that GitHub uses to generate these, as I have never managed to find it 🤔
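To make the difference concrete, here is a minimal sketch of a slug function that drops inline HTML tags before slugifying. It is not GitHub's actual algorithm (which, as noted, nobody in this thread has located) and it is not the mdit-py-plugins implementation; the function name `github_style_slug` and the exact character rules are assumptions chosen only to reproduce the slug from the report.

```python
import re


def github_style_slug(heading: str) -> str:
    """Hypothetical helper: approximate a GitHub-like slug, ignoring inline HTML."""
    # Drop inline HTML tags such as <a name="S-const"></a>.
    text = re.sub(r"<[^>]+>", "", heading)
    text = text.strip().lower()
    # Keep only word characters, hyphens and spaces (an assumed rule set).
    text = re.sub(r"[^\w\- ]", "", text)
    # Each space becomes a hyphen.
    return text.replace(" ", "-")


slug = github_style_slug('<a name="S-const"></a>Con: Constants and immutability')
print(slug)
```

For the heading in the report this yields `con-constants-and-immutability`, matching the GitHub URL quoted above. Duplicate headings and non-ASCII edge cases are not handled. MyST documents a `myst_heading_slug_func` option for custom slug functions, but whether such a function receives the heading text with or without the HTML is not established in this thread.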

chrisjsewell avatar Aug 28 '22 23:08 chrisjsewell

The manually added anchor slugs also do not work and these links produce nasty warnings

The problem with allowing these is that they would only work for HTML output; for output like LaTeX, docutils/sphinx would not know how to resolve them.

chrisjsewell avatar Aug 28 '22 23:08 chrisjsewell

You could of course replace:

[Con: Constants and immutability](#S-const)

with

<a class="reference internal" href="#S-const">Con: Constants and immutability</a>

if you want to fully bypass myst reference resolution
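When the Markdown comes from an external source and cannot be edited by hand, the replacement above could in principle be applied by a preprocessing step before the build. The following is only a sketch under that assumption: the function name `bypass_myst_refs` and the regular expression are hypothetical, it rewrites only same-page links of the form `[text](#anchor)`, and it does not skip code blocks or handle nested brackets.

```python
import re

# Matches [text](#anchor) but not image syntax ![alt](#anchor).
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(#([^)\s]+)\)")


def bypass_myst_refs(markdown: str) -> str:
    """Hypothetical helper: turn same-page Markdown links into raw HTML anchors."""
    return LINK_RE.sub(r'<a class="reference internal" href="#\2">\1</a>', markdown)


line = "* [Con: Constants and immutability](#S-const)"
print(bypass_myst_refs(line))
```

As noted earlier in the thread, links produced this way only make sense for HTML output.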

chrisjsewell avatar Aug 28 '22 23:08 chrisjsewell

So we're having a similar sort of issue, where we have docs that are built for GitHub rendering, and it seems to just not enjoy being pulled through Myst.

You can see an example of the issue here

Rendering w/ myst-parser 0.18.0 , I noticed two things:

  • the manual <a name="make_bucket"></a> doesn't get rendered as an anchor on the page at all - MyST seems to drop it
  • Setting the myst_heading_anchors here would normally work, except for the way we are using those headers (full method descriptions)

Possible answer here is to redo all our reference docs and not have the full method as part of the heading. Which...we can probably do.

But if MyST did render those anchors, then I think the linking might work.

As another issue, I cannot figure out how to suppress the reference warnings in the meantime - we now have hundreds of them. I tried adding


suppress_warnings = [ 
   'myst.reference',
   'myst.target',
   'myst.*',
   'myst'
]

But none of those seemed to take. Any ideas there?

ravindk89 avatar Sep 01 '22 19:09 ravindk89

As another issue, I cannot figure out how to suppress the reference warnings in the meantime - we now have hundreds of them.

For anyone looking for the warning suppress spell here: suppress_warnings = ['ref.myst'] seems to work.

lambdanis avatar Feb 16 '24 10:02 lambdanis

As another issue, I cannot figure out how to suppress the reference warnings in the meantime - we now have hundreds of them.

For anyone looking for the warning suppress spell here: suppress_warnings = ['ref.myst'] seems to work.

This is in the documentation 😄 https://myst-parser.readthedocs.io/en/latest/configuration.html#build-warnings
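Putting the settings mentioned in this thread together, a `conf.py` fragment might look like the sketch below. The values are taken from the comments above (`myst_heading_anchors = 3` and the `ref.myst` warning type); the surrounding project layout is assumed, and the warning type may differ between myst-parser versions, so check it against the linked documentation.

```python
# conf.py (fragment, a sketch based on the settings quoted in this thread)

# Enable MyST Markdown parsing in Sphinx.
extensions = ["myst_parser"]

# Auto-generate anchors for headings up to level 3.
myst_heading_anchors = 3

# Silence "'myst' reference target not found" warnings, as reported above.
suppress_warnings = ["ref.myst"]
```

This only hides the warnings; the links themselves remain unresolved.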

chrisjsewell avatar Feb 16 '24 12:02 chrisjsewell