
BUG: docs extension: unspecified subclass documentation causes superclass documentation to be overridden

Open mvashishtha opened this issue 11 months ago • 5 comments

e.g. if I add custom documentation for DataFrameGroupBy.mean but not for SeriesGroupBy.mean, then SeriesGroupBy.mean and DataFrameGroupBy.mean are the same object (the subclass inherits the method rather than defining its own), and we overwrite that object's __doc__ with the pandas docstring when we define SeriesGroupBy.mean. The custom DataFrameGroupBy.mean docstring is lost as a result.
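
The aliasing can be reproduced without modin. This is a minimal sketch, assuming SeriesGroupBy subclasses DataFrameGroupBy and does not redefine mean; the classes below are stand-ins, not modin's real ones:

```python
# Stand-in classes (assumption: they mirror only the inheritance relationship).
class DataFrameGroupBy:
    def mean(self):
        """Custom DataFrameGroupBy.mean docstring."""


class SeriesGroupBy(DataFrameGroupBy):
    pass  # mean is inherited, not redefined


# Both attribute lookups return the very same function object.
same_object = SeriesGroupBy.mean is DataFrameGroupBy.mean

# Simulate the override that happens while SeriesGroupBy is being processed.
SeriesGroupBy.mean.__doc__ = "pandas docstring for SeriesGroupBy.mean"

# The parent's custom docstring has been replaced as a side effect.
parent_doc_after = DataFrameGroupBy.mean.__doc__
```

After the assignment, `parent_doc_after` holds the pandas docstring, because there was only ever one function to write to.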

one solution: explicitly define methods like SeriesGroupBy.mean so that they become distinct from the superclass methods.
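
A rough sketch of that approach, again with stand-in classes rather than modin's real ones: once the subclass defines its own mean, writing to its __doc__ no longer touches the parent.

```python
# Stand-in classes (assumption: simplified, no real groupby logic).
class DataFrameGroupBy:
    def mean(self):
        """Custom DataFrameGroupBy.mean docstring."""
        return "mean"


class SeriesGroupBy(DataFrameGroupBy):
    def mean(self):
        # Explicit definition: a distinct function object that delegates
        # to the parent implementation.
        return super().mean()


# Overriding the subclass docstring now leaves the parent alone.
SeriesGroupBy.mean.__doc__ = "pandas docstring for SeriesGroupBy.mean"
```

The cost is one thin wrapper per method that needs a distinct docstring.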

mvashishtha avatar Mar 22 '24 21:03 mvashishtha

update: seems like a better solution is to have the SeriesGroupBy docstring class inherit from the DataFrameGroupBy docstring class.
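
A sketch of why inheritance between docstring classes helps, assuming the extension finds a docstring by looking up the method name on the docstring class. The class names and the `lookup_doc` helper are hypothetical, not part of modin:

```python
class DataFrameGroupByDocs:
    def mean(self):
        """Custom mean docstring shared by both groupby classes."""


class SeriesGroupByDocs(DataFrameGroupByDocs):
    # No mean defined here; lookup falls through to DataFrameGroupByDocs.
    pass


class StandaloneSeriesGroupByDocs:
    # Same idea without the inheritance: there is no mean to find.
    pass


def lookup_doc(doc_class, name):
    """Hypothetical lookup: return the docstring for name, or None if absent."""
    attr = getattr(doc_class, name, None)
    return None if attr is None else attr.__doc__


inherited = lookup_doc(SeriesGroupByDocs, "mean")
missing = lookup_doc(StandaloneSeriesGroupByDocs, "mean")
```

With inheritance, the lookup for the Series docstring class returns the custom text; without it, the lookup returns None and the pandas docstring would be used instead.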

sfc-gh-mvashishtha avatar Mar 27 '24 16:03 sfc-gh-mvashishtha

To clarify the behaviors with a DocModule:

  1. If there's a DocModule, and that module defines a subclass like SeriesGroupBy, but that subclass is missing a docstring for a method, then we override the docstring with the pandas docstring. Because the base class method and the subclass method are the same object, both end up with the pandas docstring. We make the decision to override here.
  2. If the DocModule does not define a subclass at all, then we don't override the docstring entirely, but we still add the apilink for the subclass, even when that doesn't make sense. e.g. currently I have a docstring class defining the astype docstring for BasePandasDataset, but I don't have a DataFrame or Series docstring class. We get the custom docstring we want, with no apilink, for the base astype. But then Series.astype adds the apilink for Series, because Series is defined with _inherit_docstrings(overwrite=False, apilink="pandas.Series"). When we go through the inheritance for DataFrame, we know not to change the docstring at all. The end result is that the docstrings for BasePandasDataset.astype, DataFrame.astype, and Series.astype all end with "See pandas API documentation for pandas.Series.astype https://pandas.pydata.org/pandas-docs/version/2.1.4/reference/api/pandas.Series.astype.html for more."
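
The second scenario can be imitated with a toy helper. This is only a sketch: `append_apilink` is hypothetical and much simpler than `_inherit_docstrings`, and the classes are stand-ins. It shows how a note appended via Series leaks into the base class and DataFrame:

```python
class BasePandasDataset:
    def astype(self):
        """Custom astype docstring."""


class Series(BasePandasDataset):
    pass


class DataFrame(BasePandasDataset):
    pass


def append_apilink(cls, name, apilink):
    """Hypothetical helper: append a pandas API note to cls.<name>.__doc__."""
    func = getattr(cls, name)  # for an inherited method this is the base function
    func.__doc__ += f"\nSee pandas API documentation for {apilink}.{name} for more."


# Only Series asks for an apilink, but the function it mutates is shared.
append_apilink(Series, "astype", "pandas.Series")

base_doc = BasePandasDataset.astype.__doc__
frame_doc = DataFrame.astype.__doc__
```

Both `base_doc` and `frame_doc` now mention pandas.Series.astype, even though neither class requested that link.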

not sure about how to solve these; will come back to this soon.

mvashishtha avatar Apr 02 '24 23:04 mvashishtha

the desired behaviors in both scenarios:

  1. The current behavior is correct. The subclass in the docs module should inherit from the parent class in the docs module.
  2. Between DocModule.put() calls, we shouldn't replace the docs for the same object more than once. Doing so creates too much confusion about which docstring wins.
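
One possible way to enforce "replace at most once" is to remember which objects have already been handled. The `DocReplacementTracker` class below is a hypothetical sketch, not modin code; it assumes `reset()` would be called on each DocModule.put():

```python
class DocReplacementTracker:
    """Hypothetical helper: remember which objects already had docs replaced."""

    def __init__(self):
        # Map id(obj) -> obj; holding the object prevents its id being reused.
        self._seen = {}

    def reset(self):
        """Forget everything, e.g. when a new docs module is put in place."""
        self._seen.clear()

    def replace_doc(self, obj, new_doc):
        """Set obj.__doc__ unless obj was already handled; return True if set."""
        if id(obj) in self._seen:
            return False
        obj.__doc__ = new_doc
        self._seen[id(obj)] = obj
        return True


# Stand-in classes: SeriesGroupBy inherits mean from DataFrameGroupBy.
class DataFrameGroupBy:
    def mean(self):
        """Original docstring."""


class SeriesGroupBy(DataFrameGroupBy):
    pass


tracker = DocReplacementTracker()
first = tracker.replace_doc(DataFrameGroupBy.mean, "Custom mean docstring.")
# Second attempt reaches the same function through the subclass and is skipped.
second = tracker.replace_doc(SeriesGroupBy.mean, "pandas mean docstring.")
```

With this guard the custom docstring survives, which matches the expectation that inheriting from a superclass docstring class would no longer be necessary.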

mvashishtha avatar Apr 03 '24 18:04 mvashishtha

I guess the fix for (2) will change the behavior of (1) so that inheriting from a superclass is not necessary. That makes sense to me.

sfc-gh-mvashishtha avatar Apr 10 '24 20:04 sfc-gh-mvashishtha

NOTE: when testing this, we have to test both methods and properties. For properties, we override the docstring by replacing the original property object rather than by assigning to __doc__ in place.
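
A sketch of the property case, with stand-in classes and a hypothetical `replace_property_doc` helper. It assumes the replacement is set on the class being processed, which is an assumption about the approach rather than a description of modin's code:

```python
class BasePandasDataset:
    @property
    def size(self):
        """Custom size docstring."""
        return 0


class Series(BasePandasDataset):
    pass  # size is inherited


def replace_property_doc(cls, name, new_doc):
    """Hypothetical helper: rebuild the property on cls with a new docstring."""
    prop = getattr(cls, name)  # class-level lookup returns the property object
    setattr(cls, name, property(prop.fget, prop.fset, prop.fdel, new_doc))


replace_property_doc(Series, "size", "pandas docstring for Series.size")

base_doc = BasePandasDataset.size.__doc__
series_doc = Series.size.__doc__
```

In this sketch the new property lives only on Series, so the base class keeps its custom docstring. That differs from the method case, where assigning to __doc__ mutates the shared function, which is why tests need to cover both paths.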

sfc-gh-mvashishtha avatar Apr 19 '24 22:04 sfc-gh-mvashishtha