numpydoc
numpydoc copied to clipboard
Support type annotations (PEP 484)
With support for Python 2.7 being dropped in a number of projects, some might consider using type annotations from PEP 484.
In particular, see discussions in https://github.com/numpy/numpy-stubs, https://github.com/pandas-dev/pandas/issues/14468 https://github.com/scipy/scipy/issues/9038 https://github.com/scikit-learn/scikit-learn/issues/11170
A recurrent issue when using type annotations is that one has then to keep consistency between types in the docstring and type annotations which is extra work and can be tedious.
It would be great if numpydoc could take the type information from annotations. For instance,
def func(x):
"""Some function
Parameters
----------
x : int
Description of parameter `x`.
"""
could optionally be written as,
def func(x: int):
"""Some function
Parameters
----------
x
Description of parameter `x`.
"""
and generate the same HTML.
To keep backward compatibility this mechanism could be enabled with an optional config flag, and only be used if the type is not provided in the docstring.
For docstring formatting without numpydoc this functionality is implemented in sphinx-autodoc-typehints (that also works with sphinx.napoleon). So maybe integration with that extension could be a possibility.
This would also have the added benefit that type annotations, unlike docstrings types, can be programmatically checked for consistency (e.g. with mypy)..
Interesting topic! Two remarks:
- When using such an "inject by sphinx" approach, that would mean that the docstring is not updated with the typing information in interactive sessions?
- For me, typing annotations are more oriented to developers (and their tools) / advanced users, and not that newbie friendly (eg I find a type description of
str or float
more friendly thanUnion[str, float]
)
(It might be something each project needs to discuss and decide for themselves if they find those items important or not, but still, interested to see what others think)
I'm sure if we wanted, we could find a way to inject types into docstrings at module load time, making sure the types are available in pydoc etc. Apart from being magic, this would entail a runtime dependency on numpydoc (but not necessarily sphinx)
When using such an "inject by sphinx" approach, that would mean that the docstring is not updated with the typing information in interactive sessions?
As Joel said it's possible, but a broader question is whether we want to repeat types in the docstring if they are already shown in the function signature. For instance, if we take modern typed languages, Rust docstrings don't repeat that information neither does say C++ xtensor project (which seem to use sphinx BTW).
Having this issue closed, with "it's a better practice not to repeat type information in the docstring" is also a possibility. I was just interested in some feedback about this in general.
For me, typing annotations are more oriented to developers (and their tools) / advanced users, and not that newbie friendly (eg I find a type description of str or float more friendly than Union[str, float])
I agree. Transforming simple types to a more human-readable version in the docstring is a possibility, though the questions is what to do with more complex types (it could be possible to limit the deph say show Dict[str, Tuple[int]]
as just dict
.
This actually seems to be already implemented in napoleon https://github.com/sphinx-doc/sphinx/blob/master/doc/usage/extensions/napoleon.rst#type-annotations
Will try to investigate how to add something similar to numpydoc..
Instead of
def func(x: int):
"""Some function
Parameters
----------
x
Description of parameter `x`.
"""
I would love
def func(x: int):
"""Some function
Parameters
----------
x : Description of parameter `x`.
"""
An alternative approach mentioned by @ogrisel today was to have some migration tool that would take code with type annotations and update the docstring types in the corresponding python files .
If it happens I also think that numpydoc could be a good place for it. Somewhat related to docsting validation, but with inline modification of the code.
@rth Do you happen to know a tool which can do that?
I had a look at https://github.com/dadadel/pyment in February, but it had a couple of issues (e.g. https://github.com/dadadel/pyment/issues/84 ) which made it unusable for me.
I had understood @ogrisel to be suggesting a tool which would check compatibility between the docstring type and the annotation, rather than risk automatic generation.
A tool that checks compatibility would be great. Then one could easily use it as a pre-commit hook, for instance.
A tool that checks compatibility would be great. Then one could easily use it as a pre-commit hook, for instance.
Have you looked at darglint? It looks like it will do that.
I just checked darglint. It is exactly what we need for compatibility checking. The only problem is that it is a bit slow, at least for now.
Instead of a tool that checks if type annotations and types in docstrings are compatible, direct support for type annotations in numpydoc would be much better IMO. I'd love to be able to write a function as mentioned in the initial post. I think type annotations can also be useful for users (and not only developers), and if user-friendliness is an issue I like the suggestion to convert certain annotations to something more readable.
Any updates on this?
I think I also hit this roadblock.
I have something like,
def load_config_file(self, path: Union[str, pathlib.Path]) -> None:
"""
Load a configuration file in one of the supported formats, and merge it with
the current config if it exists.
Parameters
----------
path: Union[str, pathlib.Path]
config file to load. [yaml, toml, json, py] formats are supported
"""
then I use the numpydoc validator via sphinx-build and I get
pyswgo.core.tool.Tool.load_config_file:PR01:Parameters {'path'} not documented
pyswgo.core.tool.Tool.load_config_file:PR02:Unknown parameters {'pathlib.Path]', 'path: Union[str'}
pyswgo.core.tool.Tool.load_config_file:PR10:Parameter "path" requires a space before the colon separating the parameter name and type
pyswgo.core.tool.Tool.load_config_file:PR08:Parameter "path: Union[str" description should start with a capital letter
pyswgo.core.tool.Tool.load_config_file:PR09:Parameter "path: Union[str" description should finish with "."
pyswgo.core.tool.Tool.load_config_file:PR04:Parameter "pathlib.Path]" has no type
pyswgo.core.tool.Tool.load_config_file:PR08:Parameter "pathlib.Path]" description should start with a capital letter
pyswgo.core.tool.Tool.load_config_file:PR09:Parameter "pathlib.Path]" description should finish with "."
I am not sure what would be the best approach for you, but as a user I would at least like that this worked in a way that doesn't force me to exclude the associated validation checks codes as they are relevant for non-type-annotated functions.
I feel that a temporary solution numpydoc should be to at least recognize type annotations and have specific codes so you as developers can continue to think about how to properly address this and the user can safely disable any warning and still be sure that the rest of the API can be validated in a stable way.
It is unfortunate that no decision was made on this yet, as typing annotations become more and more popular. I would really prefer form which would minimize duplication of information, thus making code/docstrings less cluttered and possibly inconsistent. A few extra aspects worth considering (were not mentioned) while working it all out
Returns/Yields
How to annotate then Returns (and Yields) in particular in case of simple data types -- should it become just the description, e.g. instead of
Returns
-------
int
Description of anonymous integer return value.
(taken from https://numpydoc.readthedocs.io/en/latest/format.html#returns) to become just
Returns
-------
Description of anonymous integer return value.
if we are to avoid duplicating type annotation? or may be some "ad-hoc"
Returns
-------
_
Description of anonymous integer return value.
to signal unnamed entity, or even giving it some specific name thus aligning to the next example where names given in case of a tuple return.
Overloads
See https://docs.python.org/3/library/typing.html#typing.overload describing support for them. ATM AFAIK mypy has various issues with testing overloads (over 280 mention overload) and thus pragmatically the actual implementation might need to specify union of types to have some checking done (ref: as discovered in our case), so it might be even trickier but still -- an aspect to keep in mind.
In retrospect, IMO the best way is to go with sphinx-autodoc-typehints . I mean it already works with napoleon for numpy style docstrings, and it mentions numpydoc occasionally so it might just work with this package as well (though I haven't tested it).
I think it doesn't really make sense to start separate work on this topic, if there is already a popular package in the ecosystem that solves this problem well (and could work with numpydoc, unless there are indications that it's not the case).
@rth I just tried to use sphinx-autodoc-typehints with numpydoc and was not able to get a working configuration. I tried different configuration settings but the typehints did not show. While I agree that duplicate work should be prevented, currently there is no way which leads to the desired results.
Would you mind opening an issue about it at https://github.com/tox-dev/sphinx-autodoc-typehints to see what they think about it?