docstring_parser

Google style docstring and sphinx directives issue

Open Andry-Bal opened this issue 3 years ago • 8 comments

The following code snippet causes everything after Example: to be parsed as long_description. As a consequence, params and raises are lost.

from docstring_parser import parse

if __name__ == '__main__':
    docstring = parse(
        """
        Short description

        Long description spanning multiple lines
        - First line
        - Second line
        - Third line

        Example:
        
        .. testcode::
        
            foo = bar
            bar = foo

        Args:
            name: description 1
            priority: description 2
            sender: description 3
        
        Raises:
            IOError: some error
        """)
    print(docstring.short_description)
    print(docstring.long_description)
    print(docstring.params)
    print(docstring.raises)

Andry-Bal avatar Jul 19 '21 16:07 Andry-Bal

Where can I read more about this syntax? It gets parsed as long_description because the Google parser crashes: it treats Example: as an empty section and .. testcode:: as a separate, unknown section, so docstring_parser falls back to the next available parser.

rr- avatar Jul 21 '21 21:07 rr-

The crash is now fixed in 0.9.1, although the example won't parse properly.

rr- avatar Jul 21 '21 21:07 rr-

@rr- you can read about this syntax in https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html. @Andry-Bal a question regarding the example. Shouldn't the .. testcode:: block be indented one more level so that it is clear that it is part of the Example: section? I have never done this so don't really know how it should go. Just asking.

mauvilsa avatar Jul 24 '21 09:07 mauvilsa

@mauvilsa Frankly, I have not used it either, so I am not sure what the proper way to indent it is. However, it is used this way elsewhere, e.g. here, so I assume it is valid usage, but I could be wrong.

Andry-Bal avatar Jul 26 '21 11:07 Andry-Bal

Looking at https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings, which I guess is the definition of the Google style, the only defined sections are Args:, Returns:, Yields: and Raises:. According to this, the example in the code snippet above is just part of the long description, not a separate section, so the indentation would be correct. @rr- could you please clarify what counts as a section?

mauvilsa avatar Jul 26 '21 14:07 mauvilsa

I would say it should be indented by four spaces, like the content under Args:, Returns: etc. However, if it's a popular practice, then we should support it in this library by extending the parser to look for text that starts with "..".

rr- avatar Jul 26 '21 14:07 rr-

Looking at the Sphinx napoleon extension (which is what handles Google style docstrings), I think Example: is considered an independent section, see napoleon.html#docstring-sections. There is also an example in which the content is indented with respect to the section title, see example_google.html#example-google. I also tested generating Sphinx documentation from an example with an indented code block, and it works correctly. Based on this, I would say that the correct way is to indent the content.
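For reference, here is a minimal sketch of the indented layout this comment describes, with the .. testcode:: block nested one level under Example:. The function name send and its arguments are hypothetical placeholders. Only the standard library is used, so this shows the layout itself and makes no claim about how docstring_parser or napoleon will parse it.

```python
import inspect


def send(name, priority, sender):
    """Short description

    Long description spanning multiple lines

    Example:
        .. testcode::

            foo = bar
            bar = foo

    Args:
        name: description 1
        priority: description 2
        sender: description 3

    Raises:
        IOError: some error
    """


# inspect.getdoc() strips the common leading indentation, so section
# headers end up in column 0 and section content stays indented.
doc_lines = inspect.getdoc(send).splitlines()
example_at = doc_lines.index("Example:")

# The line right after the header is the indented directive.
print(repr(doc_lines[example_at + 1]))
```

With this layout there is no blank line between Example: and its content, and the directive is indented like the entries under Args: and Raises:.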

On another note, I think that docstring-parser should behave the same way as napoleon. With the wrong indentation, the example content is not lost. I'm not sure how it works internally, but in the rendered HTML it looks like Example: is not considered a section (since it would be empty) and simply becomes part of the long description, code snippet included. I could be wrong. I think the source code is https://github.com/sphinx-contrib/napoleon/blob/master/sphinxcontrib/napoleon/docstring.py. In any case, I think the parser should preserve all content somewhere, not only when it starts with something special like "..".

mauvilsa avatar Jul 27 '21 07:07 mauvilsa

In Griffe we had similar false positives in Google-style docstrings: text was matched as a section when it wasn't one. I fixed that by setting stricter rules for section matching in the parser. A section is only a section if:

  • there's a blank line before the section header
  • there's no blank line after the section header
  • the section contents are indented

See https://mkdocstrings.github.io/griffe/docstrings/#google-style.
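The three rules above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Griffe's or docstring_parser's actual code: the helper find_sections and the header regex are hypothetical, and treating a header on the very first line as valid is an assumption.

```python
import re
import textwrap

# Hypothetical header pattern: a title in column 0 followed by a colon.
SECTION_HEADER = re.compile(r"^(\w[\w ]*):$")


def find_sections(docstring: str) -> list[str]:
    """Return titles of headers that satisfy all three rules."""
    lines = textwrap.dedent(docstring).strip("\n").splitlines()
    found = []
    for i, line in enumerate(lines):
        match = SECTION_HEADER.match(line)
        if match is None:
            continue
        # Rule 1: a blank line precedes the header
        # (assumption: the first line of the docstring also qualifies).
        blank_before = i == 0 or not lines[i - 1].strip()
        # Rules 2 and 3: the next line exists, is not blank, and is indented.
        has_body = (
            i + 1 < len(lines)
            and lines[i + 1].strip() != ""
            and lines[i + 1][0].isspace()
        )
        if blank_before and has_body:
            found.append(match.group(1))
    return found


UNINDENTED = """
Short description

Example:

.. testcode::

    foo = bar

Args:
    name: description 1

Raises:
    IOError: some error
"""

INDENTED = """
Short description

Example:
    .. testcode::

        foo = bar

Args:
    name: description 1
"""

print(find_sections(UNINDENTED))
print(find_sections(INDENTED))
```

Under these rules, the Example: header from the original report is followed by a blank line, so it is treated as regular text and Args: and Raises: are still found. The indented variant is recognized as a section.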

Other tools enforce similar rules: for example, Ruff implements rules D411 and D412, which come from pydocstyle itself.

This is a tricky situation though: sometimes the user did want to write a section, and used incorrect spacing, and sometimes the user did not want to write a section, and it should then be parsed as regular markup. In Griffe we don't warn or error out on incorrect section syntax, we only log a debug message saying "if you wanted a section, here's what's wrong with your syntax".

pawamoy avatar Oct 27 '23 12:10 pawamoy