feature: Support for accessors
Is your feature request related to a problem? Please describe.
If you document accessors, there isn't an obvious solution for how to show the namespace in which the accessor ends up in (as opposed to the original API). In the sphinx world, there is a package dedicated to handling this situation: https://github.com/xarray-contrib/sphinx-autosummary-accessors
Describe the solution you'd like
A similar thing for the mkdocs universe.
Describe alternatives you've considered
Additional context
I have written such an extension here: https://github.com/climate-resource/mkdocstrings-python-accessors. As you can see, it is super basic, but it seems to do what I want (see an example at the bottom of this page , with built docs here (note the namespace which is documented)).
Hi @znichollscr, thanks for the feature request!
So, if I understand correctly, pandas provides API that lets developers register "accessors", which are then available from pandas' API itself. Since users are expected to use the pandas API instead of the developer's accessor, developers want to document it as such in their API docs.
IMO it would better live as a Griffe extension rather than a mkdocstrings handler feature. Are you aware of Griffe's extension system / have you considered using it?
So, if I understand correctly, pandas provides API that lets developers register "accessors", which are then available from pandas' API itself. Since users are expected to use the pandas API instead of the developer's accessor, developers want to document it as such in their API docs.
Exactly.
Are you aware of Griffe's extension system / have you considered using it?
I did have a read through it and prototyped an extension. I was nervous about using that, because I wasn't sure what would happen if I changed the names used in Griffe's tree given that there is no code under the pandas API (so if Griffe goes looking for the source, it won't find anything which might affect source previews, for example). If you think that's not an issue/have suggestions for how to avoid this, I'd be very happy to hear them.
Thanks for the fast response and great package by the way!
I see, thanks. Yes this is a valid concern. I never had this use-case before where one wants to render something under a different name, under a different package even, while not actually appearing in this package's tree. Well, the extension could actually add objects into the pandas tree, but the filepath (used to get source among others) is a property. I wonder if the extension could simply override the filepath property with a fixed attribute. Mypy would probably complain, but I think it might work? Also there's the issue that pandas must be loaded before the extension can add objects to its tree.
Maybe the simpler solution here is really to just override the identifier of the object while rendering. Something akin to https://github.com/mkdocstrings/mkdocstrings/issues/725, but for the identifier itself instead of the heading title or ToC label.
::: path.to.my.registered.accessor
options:
identifier: pandas.dataframe.my_namespace
That could cause issues with autorefs though.
I'll have a look at your custom handler to see exactly how you render it, to see if there's a way to make this all work in a generic way 🙂
Thanks for the fast response and great package by the way!
You're welcome! Thank you for the kind words 💟
Oh OK you simply overwrite the name of Griffe objects. I think that will give you incorrect paths (ids) and therefore incorrect inventory entries, as well as incorrect autorefs (making your objects un-reference'able).
Very true. Once you think about it, it's actually quite a complex problem. The solution we have definitely will have plenty of drawbacks.
Well, the extension could actually add objects into the pandas tree, but the
filepath(used to getsourceamong others) is a property. I wonder if the extension could simply override thefilepathproperty with a fixed attribute.
This might actually be the only way to do this. I guess that would also better follow the accessor pattern in general, which does actually inject an API into another package's namespace (effectively). As you say though, you have to then somehow have a way to trace back to the original code because the API only exists at runtime, never statically (at least not without some clever stub work probably, maybe even not then, this does not bode well...).
Here's how I'd go about injecting the object in the pandas tree:
import ast
import copy
import griffe
class PandasAccessors(griffe.Extension):
def on_class_instance(self, *, cls: griffe.Class, agent: griffe.Visitor | griffe.Inspector, **kwargs):
try:
decorator = next(dec for dec in cls.decorators if dec.callable_path == "pandas.register_accessor")
except StopIteration:
return
namespace = ast.literal_eval(decorator.value.arguments[0])
pd_obj = copy.deepcopy(cls)
pd_obj.filepath = cls.filepath
agent.modules_collection["pandas.DataFrame"].set_member(namespace, pd_obj)
...then I'd run that and iterate to fix all the things that go wrong 😄
EDIT: deepcopy might be a bit brutal here, as objects are heavily linked together. Could end up copying a lot more than just the class and its members. Maybe it'd be better to re-create a class instance instead.
Very nice! I guess the other part of my issue is that I don't know the API well enough to think about I'd need to test to make sure everything works (I guess rendering, the ability to cross-reference, the correct objects end up in the inventory (which means other libraries could cross-reference this...)).