Allow for autodoc to parse Markdown docstrings

Open chrisjsewell opened this issue 4 years ago • 49 comments

Originally posted by @asmeurer in https://github.com/executablebooks/MyST-Parser/issues/163#issuecomment-679917591

This issue will be of relevance here: https://github.com/sphinx-doc/sphinx/issues/8018

chrisjsewell avatar Aug 25 '20 09:08 chrisjsewell

There's also the question of numpydoc, which defines its own syntax for some things like parameters. Should myst use the same syntax, but just using Markdown markup in the text? Or should it use something more markdownic?

asmeurer avatar Aug 25 '20 19:08 asmeurer

I have just added definition list syntax rendering 😄 : see https://myst-parser.readthedocs.io/en/latest/using/syntax-optional.html#definition-lists

I think this could come in handy for an autodoc extension. Something like:

# Parameters

param1
: Description of param1

param2
: Description of param2

That's maybe more markdownic?
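For illustration only, here is a rough sketch of how such a definition-list "Parameters" section could be read into a mapping. It uses only the standard library rather than markdown-it-py, and `parse_param_deflist` is a hypothetical helper name, not something MyST provides.

```python
from typing import Dict, Optional


def parse_param_deflist(text: str) -> Dict[str, str]:
    """Collect ``term`` / ``: definition`` pairs from a definition list."""
    params: Dict[str, str] = {}
    term: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            # Skip blank lines and headings such as "# Parameters".
            continue
        if line.startswith(": ") and term is not None:
            # A definition line belongs to the most recent term.
            params[term] = line[2:].strip()
        else:
            term = line
    return params
```

This ignores multi-line definitions and nested markup, which a real markdown-it-py plugin would have to handle.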

chrisjsewell avatar Aug 25 '20 20:08 chrisjsewell

I don't know. There's also the Google docstring style, which is a little different (and preferred by many people). It would probably be a good idea to get broader community feedback on these things.

asmeurer avatar Aug 25 '20 20:08 asmeurer

It would probably be a good idea to get broader community feedback on these things.

Yep absolutely

But note, numpydoc and Google formats are both built around rST syntax. A markdown extension would use markdown-it-py to initially parse the docstring, and so any format has to be compatible with it in some fashion: utilising existing syntax plugins, or writing new ones.

chrisjsewell avatar Aug 25 '20 20:08 chrisjsewell

If it matters, I'm working on decoupling the parsing of docstrings from their rendering in IPython/Jupyter; basically, if you can write a parser that goes from __doc__ to some well-defined data structure with the right fields/info, then IPython (and by extension Jupyter) will know how to render it properly/nicely. (This could also pull some information out of __signature__.)

So, if the raw rendering shown to users in IPython/Jupyter is bothering you and influencing the syntax you are choosing, this will likely become less of an issue for users.
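As a loose illustration of that idea (not IPython's actual data structure), a parser could return something like the dataclass below. The names `ParsedDocstring` and `parse_docstring` are made up for this sketch; only `inspect.getdoc` and `inspect.signature` are real standard-library calls.

```python
import inspect
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ParsedDocstring:
    """Hypothetical renderer-agnostic result of parsing a docstring."""
    summary: str = ""
    param_types: Dict[str, str] = field(default_factory=dict)


def parse_docstring(func: Callable) -> ParsedDocstring:
    doc = inspect.getdoc(func) or ""
    parsed = ParsedDocstring(summary=doc.split("\n", 1)[0])
    # Parameter names and annotations come from the signature, not the text.
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            parsed.param_types[name] = ""
        else:
            parsed.param_types[name] = getattr(annotation, "__name__", str(annotation))
    return parsed
```

A renderer would then only need to understand `ParsedDocstring`, regardless of whether the docstring was written in reST or Markdown.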

Carreau avatar Sep 08 '20 14:09 Carreau

Thanks @Carreau, I'll bear that in mind 😄

While you're here: I just added https://myst-parser.readthedocs.io/en/latest/using/syntax-optional.html#auto-generated-header-anchors, so that you can write e.g. [](path/to/doc.md#heading-anchor) and it will work correctly both directly on GitHub and when building via Sphinx.

These anchor slugs, I've found, are a bit changeable in their implementation across renderers, but generally they are converging to the GitHub "specification".

Jupyter Notebook/Lab seems to be a bit outdated in this respect (or at least the versions I tested)? They don't lower-case or remove punctuation, etc.

I'm surprised by this, because I thought they were both generally built around markedjs at the moment (please move to markdown-it 😉), which does implement this behaviour: https://github.com/styfle/marked/blob/a41d8f9aa69a4095aedae93c6e6ee5522a588217/lib/marked.js#L1991
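For readers unfamiliar with these slugs, the behaviour being described (lower-casing, dropping punctuation, hyphenating spaces) can be approximated as follows. This is a simplified sketch with a hypothetical function name, not the code used by GitHub, marked or MyST; real implementations also handle details such as de-duplicating repeated headings.

```python
import re


def github_style_slug(heading: str) -> str:
    """Approximate a GitHub-style heading anchor."""
    slug = heading.strip().lower()
    # Drop punctuation, keeping word characters, hyphens and spaces.
    slug = re.sub(r"[^\w\- ]", "", slug)
    # Each remaining space becomes a hyphen.
    return slug.replace(" ", "-")
```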

chrisjsewell avatar Sep 09 '20 09:09 chrisjsewell

I'm very much interested in this feature as I've been using Markdown doc-strings for a while and would like to move from recommonmark to MyST.

By the way, it took me quite a while to get to this GitHub issue here. It would have helped if the section in the docs regarding the autodoc extension clearly stated that Markdown is not supported in doc-strings.

john-hen avatar Mar 30 '21 18:03 john-hen

That's a great point @John-Hennig - any interest in opening a PR to add a ```{warning} block there that also links to this issue in case folks want to give feedback?

choldgraf avatar Mar 30 '21 21:03 choldgraf

Originally posted by @asmeurer in #163 (comment)

This issue will be of relevance here: sphinx-doc/sphinx#8018

From the feedback on the autodoc issue, it sounds like it might be better to write a replacement for autodoc rather than trying to extend it?

dmwyatt avatar May 19 '21 18:05 dmwyatt

Here is a trick to have Markdown docstring with commonmark. I guess it could be done with myst_parser.

https://stackoverflow.com/questions/56062402/force-sphinx-to-interpret-markdown-in-python-docstrings-instead-of-restructuredt

Sphinx's Autodoc extension emits an event named autodoc-process-docstring every time it processes a doc-string. You can hook into that mechanism to convert the syntax from Markdown to reStructuredText.

import commonmark

def docstring(app, what, name, obj, options, lines):
    md  = '\n'.join(lines)
    ast = commonmark.Parser().parse(md)
    rst = commonmark.ReStructuredTextRenderer().render(ast)
    lines.clear()
    lines += rst.splitlines()

def setup(app):
    app.connect('autodoc-process-docstring', docstring)
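One detail of the handler above is easy to miss: Sphinx expects the handler to modify `lines` in place, since it does not use a return value from this event. The sketch below shows that contract in isolation, with a deliberately naive stand-in converter (`convert` is hypothetical) so that it runs without Sphinx or commonmark installed.

```python
from typing import List


def convert(md: str) -> str:
    """Stand-in for a real Markdown-to-reST renderer (hypothetical)."""
    # Naive: turn Markdown inline code into reST inline literals.
    return md.replace("`", "``")


def docstring(app, what, name, obj, options, lines: List[str]) -> None:
    converted = convert("\n".join(lines))
    # Slice assignment mutates the caller's list;
    # `lines = ...` would only rebind the local name and have no effect.
    lines[:] = converted.splitlines()
```

Slice assignment is equivalent to the `lines.clear()` plus `lines += ...` pair used above.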

oricou avatar May 25 '21 23:05 oricou

It's funny that you posted that as I made this comment on that a few hours ago.

dmwyatt avatar May 26 '21 01:05 dmwyatt

Here is a trick to have Markdown docstring with commonmark. I guess it could be done with myst_parser.

Yes, it does work with MyST. Since my earlier comment here, I have replaced Recommonmark with MyST in my projects and, as before, I'm using Commonmark.py to render the Markdown doc-strings. I've also updated my Stackoverflow answer to reflect that and mention MyST now that Recommonmark has been deprecated.

This works great for me, actually. But all I need in doc-strings is syntax highlighting of code examples. So nothing fancy. People who want advanced features such as math rendering, cross references, or possibly NumPy style, will have to wait for native doc-string support in MyST.

john-hen avatar May 26 '21 06:05 john-hen

@John-Hennig Great, could you share your code with MyST? TIA.

oricou avatar May 26 '21 07:05 oricou

Today I found https://github.com/mkdocstrings/mkdocstrings, is it related to the scope of this issue?

astrojuanlu avatar Aug 15 '21 08:08 astrojuanlu

@astrojuanlu mmmm probably not, because that seems to work with the mkdocs documentation engine, not Sphinx, no? Or is it usable for Sphinx as well?

choldgraf avatar Aug 15 '21 16:08 choldgraf

Right, it's based on MkDocs - I brought it up because it could inform the format of the docstring, regardless of the implementation.

astrojuanlu avatar Aug 16 '21 06:08 astrojuanlu

If anyone is motivated to tackle this, I would say an initial step would be to implement a field list (https://docutils.sourceforge.io/docs/ref/rst/restructuredtext.html#field-lists) plugin within https://github.com/executablebooks/mdit-py-plugins.

Using this, we could implement the classic docstring structure:

def func(a):
    """Function description.

    :param a: Parameter description, but with *Markdown* syntax
    """

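To make the field-list idea concrete, here is a minimal standard-library sketch that pulls `:name: body` fields out of a docstring like the one above. It is not the mdit-py-plugins implementation; `parse_field_list` is a hypothetical name, and continuation lines are ignored.

```python
import re
from typing import Dict

# ":param a: text" -> name="param a", body="text"
FIELD_RE = re.compile(r"^:(?P<name>[^:]+):\s*(?P<body>.*)$")


def parse_field_list(docstring: str) -> Dict[str, str]:
    fields: Dict[str, str] = {}
    for line in docstring.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # The body is kept verbatim, so any Markdown inside it
            # can be handed to a Markdown parser afterwards.
            fields[match.group("name")] = match.group("body")
    return fields
```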
chrisjsewell avatar Aug 16 '21 09:08 chrisjsewell

UPDATE:

With #455 implemented, it is now fully possible to use sphinx's python-domain directives in MyST 🎉 (see https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#field-lists). For example, this will be properly parsed:

```{py:function} send_message(sender, priority)

Send a message to a recipient

:param str sender: The person sending the message
:return: the message id
:rtype: int
```

The sticking point now for autodoc (and similarly for https://github.com/readthedocs/sphinx-autoapi/issues/287) is that the auto directives first use Documenter sub-classes to generate source text (which is subsequently parsed), but the source text generation is currently hard-coded to RST (see https://github.com/sphinx-doc/sphinx/blob/edd14783f3cc6222066fd63efbe28c2728617e18/sphinx/ext/autodoc/__init__.py#L299).

For example,

```{autoclass} myst_parser.docutils_renderer.DocutilsRenderer
```

Is first converted to the text

.. py:class:: DocutilsRenderer(*args, **kwds)
   :module: myst_parser.docutils_renderer

   A markdown-it-py renderer to ...

which MyST cannot parse.

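Primarily you just need to override some aspects of these documenters to handle converting to MyST, something like:

from typing import Any, Dict

from sphinx.application import Sphinx
from sphinx.ext.autodoc import FunctionDocumenter

class MystFunctionDocumenter(FunctionDocumenter):
    def add_directive_header(self, sig: str) -> None:
        # parser_is_rst / parser_is_myst are placeholders for however
        # the active parser ends up being detected
        if parser_is_rst:
            super().add_directive_header(sig)
        if parser_is_myst:
            ...

then you load them via an extension:

def setup(app: Sphinx) -> Dict[str, Any]:
    app.add_autodocumenter(MystFunctionDocumenter)
    return {}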

this is certainly achievable.
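To make the conversion step concrete, here is a rough standard-library sketch that turns a single, un-nested reST directive (like the generated text shown earlier) into the equivalent MyST fence. `rst_directive_to_myst` is a hypothetical helper, not part of Sphinx, MyST or rst-to-myst, and it ignores nested directives entirely.

```python
import re
import textwrap

FENCE = "`" * 3  # built programmatically only to keep the example easy to embed
HEADER_RE = re.compile(r"^\.\. (?P<name>[\w:-]+)::\s*(?P<args>.*)$")


def rst_directive_to_myst(rst: str) -> str:
    """Convert one flat reST directive into a MyST fenced directive."""
    header, *body = rst.splitlines()
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not a directive header: {header!r}")
    # reST marks the directive body by indentation; MyST marks it with fences.
    content = textwrap.dedent("\n".join(body)).strip("\n")
    opening = f"{FENCE}{{{match['name']}}} {match['args']}".rstrip()
    return "\n".join([opening, content, FENCE])
```

Option lines such as `:module:` are carried over unchanged, since MyST accepts the same `:key: value` form at the top of a directive body.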

One final thing (as noted in https://github.com/sphinx-doc/sphinx/issues/8018#issuecomment-665727599) is that ideally you would also be able to switch the parser based on whether your docstrings are written in RST or Markdown, i.e. it would not matter whether you called autoclass from an RST or a Markdown file: a Markdown docstring would always be parsed as Markdown.

chrisjsewell avatar Dec 06 '21 05:12 chrisjsewell

Converting the directive header may be fairly straightforward, but some of the domain directives will have a body with content that contains domain directives again. So these directives will be nested. It's quite a bit easier to do that in reST than it is in Markdown.

For example, let's say we have this module.py:

"""Doc-string of the module."""

class Class:
    """Doc-string of the class."""

    def method(self):
        """Doc-string of the method."""

We document it like so in index.rst:

.. automodule:: module
    :members:

And conf.py is simply:

extensions = ['sphinx.ext.autodoc']
import sys
sys.path.insert(0, '.')

When running sphinx-build . html -vv we see in the build log that Autodoc replaces the automodule directive with the following output:

.. py:module:: module

Doc-string of the module.


.. py:class:: Class()
   :module: module

   Doc-string of the class.


   .. py:method:: Class.method()
      :module: module

      Doc-string of the method.

It is already possible to render this with MyST:

```{py:module} module
```

Doc-string of the module.

````{py:class} Class()

Doc-string of the class.


```{py:method} Class.method()
:module: module

Doc-string of the method.
```
````

This produces the exact same HTML. But I had to put quadruple back-ticks at the outer scope to achieve the nesting. With reST, Autodoc just needs to increase the indentation level as it generates the body content of the directive line by line.

Maybe it's enough to just start with some extra back-ticks at the outer scope, for good measure. Nesting is usually not more than one level deep anyway. But the indentation also breaks the Markdown build. That's possibly an easy fix too, like override the content_indent attribute of the Documenter class. But Autodoc adds lines to the output in many different places, and often the indentation is just part of the string literal. That's where I gave up the last time I looked into this. I might give this another shot, but this could easily get quite complicated.
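One possible way around guessing the number of back-ticks is to pick the outer fence from the content itself. The sketch below assumes the nested content is already available as a string before the outer fence is written, which is not how Autodoc emits its lines; `wrap_in_directive` is a hypothetical helper.

```python
import re


def wrap_in_directive(name: str, argument: str, content: str) -> str:
    """Wrap content in a MyST directive fence longer than any fence inside it."""
    # Find back-tick fences (three or more) that start a line in the content.
    runs = re.findall(r"^[ \t]*(`{3,})", content, flags=re.MULTILINE)
    longest = max((len(run) for run in runs), default=2)
    fence = "`" * (longest + 1)
    return f"{fence}{{{name}}} {argument}\n{content}\n{fence}"
```

Wrapping a method first and then the class around it yields three back-ticks for the inner directive and four for the outer one, matching the hand-written example above.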

john-hen avatar Dec 06 '21 17:12 john-hen

Thanks for the feedback @john-hen

Note, another approach would be to override AutodocDirective, and add a line here: https://github.com/sphinx-doc/sphinx/blob/edd14783f3cc6222066fd63efbe28c2728617e18/sphinx/ext/autodoc/directive.py#L172, which uses https://github.com/executablebooks/rst-to-myst to convert the RST in params.result to MyST.

I guess this may be simpler, with the con that it pulls in more dependencies

chrisjsewell avatar Dec 06 '21 17:12 chrisjsewell

I now have a working demo that uses MyST to parse the doc-strings:

I wrote two custom Sphinx extensions:

I did not implement the reST/MyST switch that you suggested, Chris (@chrisjsewell), as I was already struggling with the method resolution order of the derived classes. Though it should be possible. I did not try using rst-to-myst. Not saying that doesn't work, but given that we pull in doc-strings written in Markdown, it felt wrong to feed it input that isn't strictly reStructuredText.

I avoided regex substitutions as much as possible. That's often not robust and tends to turn into a series of hacks. The only exception is a helper function that Autodoc calls restify, which I wrapped with a function called mystify. (And people say naming things is hard. 😄 )
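For readers wondering what such a substitution might look like, here is a guess at the general shape, not the code from the demo: reST roles of the form ``:role:`target` `` are rewritten to MyST's ``{role}`target` `` syntax. The function name `rst_roles_to_myst` is invented for this sketch.

```python
import re

# Matches :role:`target` and :domain:role:`target`.
ROLE_RE = re.compile(r":((?:[\w-]+:)?[\w-]+):`([^`]+)`")


def rst_roles_to_myst(text: str) -> str:
    """Rewrite reST role references into MyST role syntax."""
    return ROLE_RE.sub(r"{\1}`\2`", text)
```

As the comment above warns, a regex like this is not robust against every input, which is presumably why it was kept to a single helper.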

However, the solution I settled on doesn't exactly strike me as "clean" either. It replicates a lot of code from Autodoc. Some of the duplication could be avoided with monkey-patching, I guess, or messing with the method resolution order. And I think MyST would also tolerate much of the indentation needed in reST, so maybe some modifications aren't actually necessary. Point being, there could easily be a better way than this, that I didn't think of. Keep in mind that I've never written a Sphinx extension before.

I tested with that demo project as well as a larger, but still medium-sized project I maintain. That's still a small sample size. Autodoc has many features that were not used. Ultimately, I suppose, one would have to run essentially the same tests as for Autodoc and Autosummary, only with Markdown input. But I know next to nothing about Sphinx's test suite, so I left it at that. Also, Markdown containing nested code fences (using more than three back-ticks) should break the current solution. There's an easy fix for that, but there will always be a finite limit.

For comparison, I uploaded the same demo project written with reST as well as with MkDocs (checking out the competition, so to speak). The rendered docs are linked from the front page of the MyST demo build.

john-hen avatar Jan 01 '22 23:01 john-hen

This is great. I'm happy to see that MyST-style math, including equation references across modules, works here (use r"" strings otherwise \nabla becomes <newline>abla). I get spurious duplicate labels, but I think that's just because package.action overlaps package.actions and nothing to do with your work. We'll probably start using this in at least one project if you release it.

/home/jed/src/demo-MyST-docstring/docs/api/package.action.md:7: WARNING: duplicate label of equation bar, other instance in api/package.actions
/home/jed/src/demo-MyST-docstring/docs/api/package.actions.md:7: WARNING: duplicate label of equation bar, other instance in api/package.action
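The raw-string point is easy to verify in plain Python. In the small illustration below (function names are arbitrary), the first docstring silently turns `\n` into a newline, while the raw docstring keeps the backslash intact.

```python
def plain():
    """Gradient: $\nabla f$"""


def raw():
    r"""Gradient: $\nabla f$"""


# The non-raw docstring contains a real newline followed by "abla".
assert "\n" in plain.__doc__
# The raw docstring keeps the literal backslash-n sequence.
assert "\\nabla" in raw.__doc__
```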

jedbrown avatar Jan 02 '22 00:01 jedbrown

I seem to be coming across an issue trying to render links in a docstring when using autodoc and having this in my markdown file

````{eval-rst}
.. automodule:: mymodule
   :members:
   :undoc-members:
   :show-inheritance:
````

A docstring (Google style) I'm using is something like

def my_function():
    """Header

    Some text with a `link`_.

    .. _link:
        https://github.com
    """
    pass

I get a warning when generating the docs

/path/to/mymodule/__init__.py:docstring of mymodule.my_function:12: ERROR: Unknown target name: "link".

The converted HTML contains the following element for this link:

<a href="#id309"><span class="problematic" id="id310">`link`_</span></a>.</p>

The same function works just fine if I have an rst with the autodoc entry and is rendered outside of MyST. Unfortunately as mentioned in https://github.com/executablebooks/MyST-Parser/issues/519 the latest release seems to have changed how links are referenced so my other MyST generated docs that had the following now no longer work.

[Link To My Function](./source/mymodule.html#mymodule.my_function)

By using a markdown file to embed the autodoc entries these links now work, albeit in a slightly different way, but the actual docstring references in the functions/classes are broken. Happy to try out anything, as right now I'm stuck on the older version where both scenarios still work.

jborean93 avatar Feb 16 '22 04:02 jborean93

Heya @jborean93, https://github.com/executablebooks/MyST-Parser/issues/228#issuecomment-1041097220 is unrelated to the parsing of docstrings as Markdown, and should be opened as a separate issue. I seem to recall there being a similar issue already open, but couldn't find it on a quick search

chrisjsewell avatar Feb 16 '22 10:02 chrisjsewell

My apologies, I was going through the various issues and melded this with https://github.com/executablebooks/MyST-Parser/issues/163 but can see that is wrong. I'll do another scan through of the issues and open a new one if I can't find anything related.

jborean93 avatar Feb 16 '22 11:02 jborean93

With #455 implemented, it is now fully possible to use sphinx's python-domain directives in MyST 🎉 (see https://myst-parser.readthedocs.io/en/latest/syntax/optional.html#field-lists). For example, this will be properly parsed: ...

@chrisjsewell Thank you!

Both to Chris and anyone who is a stakeholder in this: there are multiple threads on this topic spanning several projects. I'm confused as to where things stand, and whether there's a viable workaround or configuration we can paste in the meantime.

sphinx-autoapi + myst-parser

In re: https://github.com/executablebooks/MyST-Parser/issues/228#issuecomment-986447789 and https://github.com/readthedocs/sphinx-autoapi/issues/287#issuecomment-986448384

What's the status of sphinx-autoapi and myst-parser? Are there still steps needed in myst-parser, sphinx, etc? What's needed to get to the point where there's a functioning demo online w/ source code we can clone?

tony avatar Apr 14 '22 20:04 tony

@oricou Is this possible with MyST's Python API? I want to use GFM style tables in my docstrings, since that's the only format VSCode supports apparently.

demberto avatar Sep 12 '22 13:09 demberto

Ok so who wants to give it a go 😄 : https://sphinx-autodoc2.readthedocs.io/en/latest/quickstart.html#using-markdown-myst-docstrings

chrisjsewell avatar Feb 17 '23 17:02 chrisjsewell

Now integrated into the documentation 😄 https://myst-parser.readthedocs.io/en/latest/syntax/code_and_apis.html#documenting-whole-apis

chrisjsewell avatar Mar 01 '23 06:03 chrisjsewell

Are there examples of MyST docstrings in the wild? The ones I see in the docs borrow the :param style from reST, but I'd love to see more distinct MyST features being showcased.

Fantastic job everyone!

astrojuanlu avatar Mar 01 '23 07:03 astrojuanlu