pylance-release
pylance-release copied to clipboard
Google docstring formatting for multi-line class attributes not recognized/converted properly for use in intellisense popup
Type: Bug
Behaviour
Expected vs. Actual
I apologise if this should be opened in a different repository. I am running this code with a Python 3.10.7 interpreter.
Take the following code:
from dataclasses import dataclass
@dataclass
class Foo:
"""Does the thing.
Attributes:
bar (int): A short description
buzz (str): A longer description that needs to be broken up over multiple lines,
because there is additional context that I want to provide, and I don't
want to go past the 88 line length limit for whatever reason.
"""
bar: int
buzz: str
Note: the above docstring split over multiple lines is valid as far as Google's own style-guide is concerned. The docs for the class Attributes section points to the function Args section and says that it follows the same ruleset. Quoting below (emphasis mine):
List each parameter by name. A description should follow the name, and be separated by a colon followed by either a space or newline. If the description is too long to fit on a single 80-character line, use a hanging indent of 2 or 4 spaces more than the parameter name (be consistent with the rest of the docstrings in the file). The description should include required type(s) if the code does not contain a corresponding type annotation. If a function accepts *foo (variable length argument lists) and/or **bar (arbitrary keyword arguments), they should be listed as *foo and **bar.
With that in mind, when I go to type out a variable that initializes an instance of that class, I see that the bar variable definition is "hoisted" to the top of the intellisense pop-up, as expected. However, the multi-line definition for the second parameter has an awkward line break:

I could probably get over the mangling of the multi-line string in the pop-up, but the behavior is a bit more disappointing when you start typing the variable that has that multi-line docstring:

In the above image, we can see that the pop-up does not show the full docstring for the parameter in question.
Ideally what would happen would be that:
- if you're writing out a docstring for a parameter and you've purposefully added a newline, VSCode would recognize it as a continuation of the previous line as long as the current line is indented in some uniform way (2/4 spaces).
- If you are using intellisense and filling out a parameter that has this newline added (at the correct indentation level), you should be able to see the full documentation for that parameter (again, as long as the further lines are indented from the original parameter's "level").
- Additionally, it would be great if one could create multiple paragraphs for a single parameter, so that documentation for specific parameters could be a bit more rich. This could be recognized if there are multiple line breaks, but the new paragraph would still need to be at the same indentation level to be considered part of the parameter it is documenting (meaning, 2/4 spaces indented)
Steps to reproduce:
- Create the class as shown above, with
pydocstyleturned on with--convention=googlepassed. - Review the Intellisense popup as you autocomplete the class initialization.
Diagnostic data
- Python version (& distribution if applicable, e.g. Anaconda): 3.10.7
- Type of virtual environment used (e.g. conda, venv, virtualenv, etc.): Venv
- Value of the
python.languageServersetting: Pylance
Output for Python in the Output panel (View→Output, change the drop-down the upper-right of the Output panel to Python)
XXX
User Settings
languageServer: "Pylance"
linting
• pydocstyleArgs: "--convention=google"
• pydocstyleEnabled: true
formatting
• provider: "black"
Extension version: 2022.14.0 VS Code version: Code 1.71.1 (e7f30e38c5a4efafeec8ad52861eb772a9ee4dfb, 2022-09-08T19:41:58.611Z) OS version: Darwin arm64 21.6.0 Modes: Unsupported Sandboxed: No
System Info
| Item | Value |
|---|---|
| CPUs | Apple M1 Pro (10 x 24) |
| GPU Status | 2d_canvas: enabled canvas_oop_rasterization: disabled_off direct_rendering_display_compositor: disabled_off_ok gpu_compositing: enabled metal: disabled_off multiple_raster_threads: enabled_on opengl: enabled_on rasterization: enabled raw_draw: disabled_off_ok skia_renderer: enabled_on video_decode: enabled video_encode: enabled vulkan: disabled_off webgl: enabled webgl2: enabled webgpu: disabled_off |
| Load (avg) | 5, 8, 10 |
| Memory (System) | 32.00GB (0.04GB free) |
| Process Argv | --crash-reporter-id 07e484e5-ada3-4296-adea-39d6c26a0c7c --crash-reporter-id 07e484e5-ada3-4296-adea-39d6c26a0c7c |
| Screen Reader | yes |
| VM | 0% |
Any update on this? I'm having the same problem
As an aside, if you put \ at the end of the lines that are continued, you get the expected results. This does not seem appropriate, but it might give some insight into this bug.
For example:
buzz (str): A longer description that needs to be broken up over multiple lines, \
because there is additional context that I want to provide, and I don't \
want to go past the 88 line length limit for whatever reason.
I've had the same issue for method parameters, only on the parameter intellisense display not the method itself (there the multiline params display even though the formatting is not perfect).
As an aside, if you put
\at the end of the lines that are continued, you get the expected results. This does not seem appropriate, but it might give some insight into this bug.For example:
buzz (str): A longer description that needs to be broken up over multiple lines, \ because there is additional context that I want to provide, and I don't \ want to go past the 88 line length limit for whatever reason.
I agree it does fix the problem but it's not part of the official Sphinx docs so yeah might not be appropriate agreed.
This issue has been fixed in prerelease version 2024.6.100, which we've just released. You can find the changelog here: CHANGELOG.md
The fix for this issue is behind an experimental feature flag:
"python.analysis.supportRestructuredText": true
It looks like we missed something though. The parameter raising only happens if the parameters are in a Parameters section and not an Attributes section. We've logged a separate bug to address that problem, but otherwise parameters should work across multiple lines.
@rchiodo Why is did we choose "Parameters" as a section? That's not even in the documentation for Google's docstring format from what I can tell 🤔
I'll give it a test today to see if it handles things the way I expect though.
We chose Parameters because that's a common section header in a lot of docstrings. Like Numpy docstrings. The logic for the parameter hoisting was looking for 'parameters' to hoist and that's how it finds them. Args (like google docs) would have worked too. Attributes doesn't.
The example you posted is actually the docstring we apply to an __init__ function if there's no docstring on the __init__, so it needs to special case other sections too.
Here's it also working with Args as the section header:
I'd like to also point out that most of the work for this was done using a parser for restructured text. Although close in many regards, restructured text is not Google doc string format. We're using this parser here:
https://stsewd.dev/tree-sitter-rst/
If you paste your docstring into that online parser, you'll see the AST we generate. We're trying to convert google docstrings into restructured text before parsing, but there's likely things that don't work well (and it's not always easy to tell google docstrings from just personal preferences for how to write docstrings).
This is an example:
Returns:
A dict mapping keys to the corresponding table row data
fetched. Each row is represented as a tuple of strings. For
example:
{b'Serak': ('Rigel VII', 'Preparer'),
b'Zim': ('Irk', 'Invader'),
b'Lrrr': ('Omicron Persei 8', 'Emperor')}
Returned keys are always bytes. If a key from the keys argument is
missing from the dictionary, then that row was not found in the
table (and require_all_keys must have been False).
That generates the following AST:
That AST gives us the following markdown:
Which is obviously off.
However if we modify the docstring to be what I think it would be intended in RST:
Returns
--------
A dict mapping keys to the corresponding table row data
fetched. Each row is represented as a tuple of strings. For
example::
{b'Serak': ('Rigel VII', 'Preparer'),
b'Zim': ('Irk', 'Invader'),
b'Lrrr': ('Omicron Persei 8', 'Emperor')}
Returned keys are always bytes. If a key from the keys argument is
missing from the dictionary, then that row was not found in the
table (and require_all_keys must have been False).
We generate correct markdown:
Given the number of ways people write docstrings, RST seemed like a more flexible endpoint, so we chose it as the basis for our parsing. We're hoping we can make it work for most other formats too, but as you can see it's not there yet.
I appreciate the context @rchiodo, much appreciated!
We chose Parameters because that's a common section header in a lot of docstrings. Like Numpy docstrings. The logic for the parameter hoisting was looking for 'parameters' to hoist and that's how it finds them. Args (like google docs) would have worked too. Attributes doesn't.
Does numpy say that it follows Googles docstring format? Or is it doing its own thing?
The documentation that Google provides does not say that "Args" is a valid section for a class docstring, it's intended to just be used in method/function docstrings. "Attributes" is called out as what is used for class properties/attributes in class docstrings. "Parameters" is not mentioned as a valid section. I just want to make sure that this handles Google's docstring format, which is what I opened this issue about (not Numpy's format).
I totally get that there might be some implementation issues related to dealing with Google docstrings, but unless the changes that have been made properly handle them, I don't think this issue should be closed.
That being said, I will wait for that bug to be fixed and this feature is no longer "experimental" to see if all of my concerns are addressed. Thank you for taking the time to make this better!
Does numpy say that it follows Googles docstring format? Or is it doing its own thing?
Numpy follows sphinx formatting which uses restructured text.
Edit: It seems Sphinx supports restructured text and google. Their default is restructured but their docs also mention support for google too. It would be interesting to see how they detect the difference, although I suppose we could just offer a mode or something.
Here's the issue about attributes hoisting: https://github.com/microsoft/pylance-release/issues/6012