pydoctor icon indicating copy to clipboard operation
pydoctor copied to clipboard

Handling of constructors

Open mthuurne opened this issue 4 years ago • 11 comments

Currently we don't explicitly handle constructors in the output. Instead, we document methods like __init__() and __new__() and leave it up to the reader to figure out what the constructor arguments are (__init__() is not the constructor, but often has the same arguments as the constructor).

The standard library isn't consistent in documenting constructors: sometimes they're documented on the class itself, sometimes like a function named after the class, sometimes they refer to the __init__() documentation.

epydoc allows constructor arguments to be documented on the class. We currently just show these below the class docstring, but that is not consistent with classes where the constructor arguments are documented on __init__().

When using attrs, the __init__() method can be auto-generated. Should we also auto-generate the constructor documentation or leave it up to the reader to figure it out? If we ever decide to sort attributes instead of sticking to the source order, the reader wouldn't be able to figure it out anymore from just the docs.

There are classes that offer factory methods and don't want the user to call the constructor directly. It would be nice if there was a way to mark the default constructor as private, so it is hidden under the "show private API" button.

A common pattern is the following:

class C:
    def __init__(self, x: int):
        self.x = x

To properly document this, both the x constructor argument and the x instance variable need a docstring, even though the contents are the same. It would be nice if only one of them had to be provided and the other would automatically use the same docstring.

Note that in this exact case, attrs could be used to auto-generate the __init__() method, but that won't work if one argument is trivial like this example but another is non-trivial.

mthuurne avatar Dec 01 '20 22:12 mthuurne

I like the way Javadoc presents the constructors (i.e https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/io/File.html#%3Cinit%3E(java.io.File,java.lang.String)), more generally I like that Javadoc separates Constructor / Instance variables / Class variables / Methods / Package / Module with blocks and titles.

I think we could present the constructor information as it is used by people using the class i.e. not displaying __init__(*args) but C(*args). We could show this right after the summary table if the __init__ method has a docstring, else we show the __init__() method

If we ever decide to sort attributes instead of sticking to the source order, the reader wouldn't be able to figure it out anymore from just the docs.

We should say when a class is attrs based, same goes for zope.interface. May be a footer note ?

There are classes that offer factory methods and don't want the user to call the constructor directly. It would be nice if there was a way to mark the default constructor as private, so it is hidden under the "show private API" button.

I don't think that exists

tristanlatr avatar Dec 02 '20 00:12 tristanlatr

I like the way Javadoc presents the constructors (i.e https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/io/File.html#%3Cinit%3E(java.io.File,java.lang.String)), more generally I like that Javadoc separates Constructor / Instance variables / Class variables / Methods / Package / Module with blocks and titles.

I think it's a bit weird that the members are sorted in the summary tables, but not in the detailed documentation. But other than that, it looks good.

Maybe putting static/class variables and methods in a separate block makes sense.

I think we could present the constructor information as it is used by people using the class i.e. not displaying __init__(*args) but C(*args). We could show this right after the summary table if the __init__ method has a docstring, else we show the __init__() method

I'm also developing a preference towards C(*args).

Showing the constructor as the first method makes sense, since chronologically it's often the first one you interact with. By that logic, factory methods should perhaps be displayed first as well. Or maybe static/class methods in general, if we can't easily tell apart factory methods.

Why would it matter whether the __init__ method has a docstring though? We have the parameter list from the code inspection, so there is always something to report, even if it has no docstring.

If we ever decide to sort attributes instead of sticking to the source order, the reader wouldn't be able to figure it out anymore from just the docs.

We should say when a class is attrs based, same goes for zope.interface. May be a footer note ?

I think that it's already visible when a class uses zope.interface, through its base classes, the list of implemented interfaces, the class decorators.

Regarding attrs, is that an interface or implementation concern? If it's an implementation detail, I'm not sure we should even be mentioning it in the API docs.

There are classes that offer factory methods and don't want the user to call the constructor directly. It would be nice if there was a way to mark the default constructor as private, so it is hidden under the "show private API" button.

I don't think that exists

It's possible it doesn't exist yet, but we could add for example a docstring field to mark a constructor as private.

Or we could do that automatically if a factory method exists. We would need to test this first, to check how often the existence of a factory method means that direct constructor use is unwanted.

mthuurne avatar Dec 02 '20 03:12 mthuurne

A crazy idea. Use @constructor marker to let pydoctor know that a class method is a constructor.

If a class has one method with @constructor , move the __init__ to private API. To have both __init__ and other methods as constructors, add @constructor marker to the __init__ docstring.


I rarely use constructor class methods in my Python code. java as a language is a bit different in this regard.

adiroiban avatar Dec 02 '20 07:12 adiroiban

Why would it matter whether the init method has a docstring though?

It could an indication that it's private but now I think about it that wound't be consistent.

we could add for example a docstring field to mark a constructor as private

Is this a accepted PEP? I don't think that we should create a standards ourselves

Use @constructor marker to let pydoctor know that a class method is a constructor.

Same remark here

I rarely use constructor class methods in my Python code

There are still a lot of people that do actually

Maybe separating Class Methods / Constructor / Attributes / Methods (a fortiori separating Class Variables Modules and Packages in their own box for consistency) is the way to go to render Pages in a consistent way and at the same time that would make it clear if fabric methods are present or the default constructor should be used.

tristanlatr avatar Dec 02 '20 16:12 tristanlatr

A crazy idea. Use @constructor marker to let pydoctor know that a class method is a constructor.

What package would the constructor decorator be imported from though? Adding a runtime dependency on all of pydoctor would be wasteful, so we'd probably need to create a new package for this. Something similar to typing_extensions perhaps.

If we use a docstring field, there is no runtime impact. So I think that approach is simpler. Another approach which has very little runtime impact would be to define a __private_init = True class variable.

One reason to use a decorator would be if other code analyzers such as pylint would also want to use it, to warn about calls to private __init__ methods from outside the class.

I rarely use constructor class methods in my Python code. java as a language is a bit different in this regard.

I use them sometimes. I like to have immutable objects, which means that any method that changes something will return a new instance. In such classes, the __init__ method typically just gets all the new data and only contains self.x = x statements; it does no processing. But if creating an empty object requires processing or if the typical way to create a new object is by converting data from a different type, a factory method is the place where that processing happens.

we could add for example a docstring field to mark a constructor as private

Is this a accepted PEP? I don't think that we should create a standards ourselves

Docstring fields aren't standardized by any PEP, as far as I know. That's why we have so many different docstring formats ;)

Maybe separating Class Methods / Constructor / Attributes / Methods (a fortiori separating Class Variables Modules and Packages in their own box for consistency) is the way to go to render Pages in a consistent way and at the same time that would make it clear if fabric methods are present or the default constructor should be used.

That would certainly help, but even if we order things like that, I think it would be useful to have a way to mark an __init__ method as private (possibly indirectly by marking something else as a constructor).

mthuurne avatar Dec 02 '20 18:12 mthuurne

My bad... I was not talking about @construtor python decorator...but rather about using an epydoc marker.

adiroiban avatar Dec 02 '20 20:12 adiroiban

More generically would could introduce:

  • :private: (rst) and @private: (epytext)
  • :public: (rst) and @public: (epytext)

Tags that would force a documentable to be hidden or not.

tristanlatr avatar Dec 02 '20 20:12 tristanlatr

What I understand is

  • The class __init__ and/or __new__ and/or __call__ methods should be analyzed specially. We could introduce a new property: Class.constructor, a Method, it could render before the summary table for instance. We would populate the attribute while analyzing the AST, only when we can determiner for sure the constructor parameters. If we can't, we should fall back to the original behaviour.
  • We should remove all the self argument on method docs.
  • We should introduce new fields :private: (rst) and @private: (epytext), :public: (rst) and @public: (epytext), the @construtor: field could also be useful if the constructor is a class method
  • For attrs, we can parse the attr.ib calls and construct the constructor signature from there. OR maybe we could use the generated code (with inspect.getsource, but that would required us to load the class), only to parse the __init__ method.

As a reference here is the code that handles the constructor in the sphinx-autoapi extension: https://github.com/readthedocs/sphinx-autoapi/blob/master/autoapi/mappers/python/astroid_utils.py#L300

Let's note that they handle the overload constructors too, sweet. And it's all MIT ;-)

tristanlatr avatar Mar 26 '21 13:03 tristanlatr

I believe we can go with a solution that is based on heuristics and fall back to the old behavior if there is an ambiguity. I don’t think we need to implements public/private docstring fields at the moment to fix this issue.

Also, maybe we can fix it without the need of requiring a new attribute on the Class object, we can refine the current constructor_params property and use it to generate a nice constructor signature on top of the child table.

tristanlatr avatar Jun 04 '22 00:06 tristanlatr

Can any staticmethod or classmethod that has type hint indicating they're returning an instance of the self type be considered as a constructor? @adiroiban

Meaning:


class Options:
        a,b,c = None, None, None
        
        # pydoctor could infer a constructor to be: "Options.create()"
        @staticmethod
        def create() -> 'Options':
            return Options()
        
        # pydoctor could infer a constructor to be: "Options.create_from_num(num)"
        @classmethod
        def create_from_num(cls, num) -> 'Options':
            c = cls.create()
            c.a = num
            return c

tristanlatr avatar Sep 18 '22 18:09 tristanlatr

I came up with some changes that present the constructor signatures on top of the class page, with the extra informations:

by default 2022-09-19 at 11 37 14 AM

It doesn't handle all improvements listed by @mthuurne, but at least it presents the constructors like they should be used by developers.

tristanlatr avatar Sep 19 '22 15:09 tristanlatr