pytkdocs
[BUG] pandera model wrongly detected as pydantic and pytkdocs tries to read non existent attributes
First of all, thanks for developing pytkdocs!
I do not use pytkdocs directly, but rather mkdocs and mkdocstrings which call pytkdocs. Here is an example for reproduction.
import pandera as pa
from pandera.typing import DataFrame
from pandera.typing import Series

class Foo(pa.DataFrameModel):
    """
    Some description
    """
    bar: Series[int]

cause_error = DataFrame[Foo]({"bar": [1, 2, 3]})
Without any code that actually instantiates the pandera models, it runs just fine. But as soon as I use the models somewhere in precomputed objects, pytkdocs runs into the following error:
ERROR - mkdocstrings: 'tuple' object has no attribute 'required'
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 205, in main
    output = json.dumps(process_json(line))
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 114, in process_json
    return process_config(json.loads(json_input))
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/cli.py", line 91, in process_config
    obj = loader.get_object_documentation(path, members)
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 358, in get_object_documentation
    root_object = self.get_module_documentation(leaf, members)
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 426, in get_module_documentation
    root_object.add_child(self.get_class_documentation(child_node))
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 544, in get_class_documentation
    self.add_fields(
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 612, in add_fields
    root_object.add_child(add_method(child_node))
  File "/usr/local/lib/python3.10/dist-packages/pytkdocs/loader.py", line 712, in get_pydantic_field_documentation
    if prop.required:
AttributeError: 'tuple' object has no attribute 'required'
It seems these pandera models are detected as pydantic, but they do not have the same attributes. For proper pydantic classes we have this:
from pydantic import BaseModel

class Test(BaseModel):
    i: int

Test.__fields__["i"].required
# yields True
For pandera models we seem to have this:
import pandera as pa
from pandera.typing import Series
from pandera.typing import DataFrame

class Foo(pa.DataFrameModel):
    bar: Series[int]

Foo.__fields__
# is {}
foo = DataFrame[Foo]({"bar": [1, 2, 3]})
# after instantiating, the fields exist, and that is probably why it causes errors in pytkdocs
Foo.__fields__["bar"]
# (<pandera.typing.common.AnnotationInfo at 0x7f...>, <pandera.api.pandas.model_components.FieldInfo("bar") object at 0x7f1...>)
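To illustrate the mismatch, here is a minimal sketch of the kind of guard that would avoid the AttributeError. It uses only the standard library: describe_field, pydantic_like and pandera_like are hypothetical stand-ins, not pytkdocs, pydantic or pandera APIs. The only thing taken from the traceback is the assumption that the loader reads .required on each value found in __fields__.

```python
from types import SimpleNamespace


def describe_field(name, prop):
    """Return a small description of a field, or None if `prop` does not
    look like a pydantic v1 field (i.e. it has no `required` attribute)."""
    required = getattr(prop, "required", None)
    if required is None:
        # Not pydantic-like (e.g. a tuple): skip instead of raising.
        return None
    return {"name": name, "required": bool(required)}


# Hypothetical stand-in for a pydantic v1 field object.
pydantic_like = SimpleNamespace(required=True)
# Hypothetical stand-in for the 2-tuple that pandera stores per field.
pandera_like = ("annotation_info", "field_info")

print(describe_field("i", pydantic_like))
print(describe_field("bar", pandera_like))
```

Whether pytkdocs should skip such fields or check the base class explicitly is a design choice I cannot judge; the sketch only shows that the tuple case can be told apart cheaply.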
Versions: pytkdocs 0.16.1, Python 3.10.7, Linux, pydantic 1.10.11 with pydantic_core 2.1.2 (pandera imposes a restriction of pydantic <2), pandera 0.15.2
Hello, thanks for the report.
Do you use the legacy handler by necessity? Out of curiosity, is there something preventing you from using the new handler?
Hi @pawamoy, thanks for pointing out that we are using a legacy handler. I guess there was a phase where our docs were not supported yet, so I kept working with mkdocstrings[python-legacy]. After upgrading to griffe, things also break: I get runtime errors (IndexError: list index out of range) inside griffe without any information, readable to me at least, on what is wrong. I guess I will need to cook up a minimal example and post it as an issue for griffe :(
That would be great if you could report these issues you get indeed. If your repo is public, I can also use it to investigate (this way you don't need to create a minimal example).
I bet the index errors come from how we parse Returns section in docstrings. Try indenting continuation lines once more:
Returns:
    A long description
    of the return value.
    Blah blah blah.
->
Returns:
    A long description
        of the return value.
        Blah blah blah.
Unless you're not using Google docstrings?
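For illustration, this is how the suggestion to indent continuation lines could look on a complete function, assuming Google-style docstrings. The function load_values is a hypothetical example and not taken from any project in this thread.

```python
def load_values(path: str) -> list[int]:
    """Read integers from a text file, one per line.

    Args:
        path: Location of the file to read.

    Returns:
        A list with one integer per non-empty line,
            in the order the lines appear in the file.
            Blank lines are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        # Skip blank lines, convert the rest to int.
        return [int(line) for line in handle if line.strip()]
```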
That worked indeed. At least there are no runtime errors anymore. I did notice a change though from pytkdocs to griffe. Before, if I had submodules, pytkdocs would list them in the documentation. Now it really only lists the main module, and not even any objects that I import from submodules. So I guess I will have to add pages individually for these submodules or get the automatic reference creation to work (is this still the best approach: https://mkdocstrings.github.io/recipes/ ?)
Still the best approach, yes. And you can use show_submodules: true to render every submodule (see https://mkdocstrings.github.io/python/usage/configuration/members/#show_submodules). We changed the default from true to false between the legacy and new handler.
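As a rough sketch of what automatic reference creation has to compute, the snippet below walks a source tree and derives the dotted identifiers that mkdocstrings accepts after ":::". It uses only the standard library and leaves out the page generation that the recipe handles; module_identifiers, reference_stub and demo are hypothetical names, and the pkg layout in demo is made up.

```python
import tempfile
from pathlib import Path


def module_identifiers(src_root):
    """Yield a dotted module path for every .py file under src_root."""
    root = Path(src_root)
    for path in sorted(root.rglob("*.py")):
        parts = path.relative_to(root).with_suffix("").parts
        if parts[-1] == "__main__":
            continue  # not useful as a reference page
        if parts[-1] == "__init__":
            parts = parts[:-1]  # a package is addressed by its directory
        if parts:
            yield ".".join(parts)


def reference_stub(identifier):
    """Return the one-line page body that asks mkdocstrings to render it."""
    return f"::: {identifier}\n"


def demo():
    """Build a throwaway package tree and list its identifiers."""
    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("pkg/__init__.py", "pkg/foo.py",
                    "pkg/sub/__init__.py", "pkg/sub/bar.py"):
            target = Path(tmp, rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.touch()
        return list(module_identifiers(tmp))


for name in demo():
    print(reference_stub(name), end="")
```

With one page per module like this, show_submodules can stay at its default, since every submodule gets its own page.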
Great. Thanks for pointing that out. It worked just fine. However, a submodule that contains a function with the same name as the submodule is not processed (it does not show up in the documentation). For example, if package.foo is a submodule that defines a function foo, the entire package.foo submodule is excluded. Likewise, if the package itself is called foo and contains a foo submodule (foo/foo.py), the package is not rendered.
Yes, these are known issues and we plan to alleviate them. Note that wildcard imports make the situation worse, so I recommend avoiding them.