cpython icon indicating copy to clipboard operation
cpython copied to clipboard

`pydoc` renders `from builtins.type` note, even if it is incorrect

Open sobolevn opened this issue 2 years ago • 5 comments

While working on https://github.com/python/cpython/pull/97958 I've noticed that there's something strange with help() and classmethods.

Take a look at this example:

import pydoc

class My:
    @classmethod
    def __init_subclass__(cls, *args, **kwargs):
        pass

    @classmethod
    def custom(cls):
        pass

print(pydoc.plain(pydoc.render_doc(My)))

It prints:

Python Library Documentation: class My in module __main__

class My(builtins.object)
 |  Class methods defined here:
 |
 |  __init_subclass__(*args, **kwargs) from builtins.type
 |      This method is called when a class is subclassed.
 |
 |      The default implementation does nothing. It may be
 |      overridden to extend subclasses.
 |
 |  custom() from builtins.type
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)

Take a look at these two entries:

  1. __init_subclass__(*args, **kwargs) from builtins.type
  2. custom() from builtins.type

While type has __init_subclass__, there's no type.custom. But, help says that there is!

>>> type.__init_subclass__
<built-in method __init_subclass__ of type object at 0x10a50c360>
>>> type.custom
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'type' has no attribute 'custom'

I think that it is incorrect and can lead to confusion. Instead it should be:

 |  __init_subclass__(*args, **kwargs) from builtins.type
 |      This method is called when a class is subclassed.
 |
 |      The default implementation does nothing. It may be
 |      overridden to extend subclasses.
 |
 |  custom()

sobolevn avatar Oct 06 '22 12:10 sobolevn

A couple of extra thoughts after reading some more source code: https://github.com/python/cpython/blame/e39ae6bef2c357a88e232dcab2e4b4c0f367544b/Lib/pydoc.py#L1032-L1042

        if _is_bound_method(object):
            imclass = object.__self__.__class__
            if cl:
                if imclass is not cl:
                    note = ' from ' + classname(imclass, mod)

Ok, looks like this is in sync with the following example:

import pydoc
from types import MethodType

class My:
    some = MethodType(some, 1)

print(pydoc.plain(pydoc.render_doc(My)))

which outputs:

Python Library Documentation: class My in module __main__

class My(builtins.object)
 |  Methods defined here:
 |
 |  some(*args, **kwargs) from builtins.int

I think that classmethods should just completely skip this part. It brings more confusion than value. So, it should be:

|  __init_subclass__(*args, **kwargs)
 |      This method is called when a class is subclassed.
 |
 |      The default implementation does nothing. It may be
 |      overridden to extend subclasses.
 |
 |  custom()

sobolevn avatar Oct 06 '22 14:10 sobolevn

Good catch.

Why this "from ..." was added in the first case? In what cases it was useful.

serhiy-storchaka avatar Oct 10 '22 16:10 serhiy-storchaka

I found only one: when MethodType is defined in a class body:

class Some:
    a = MethodType(some_func, some_instance)

Plus, Enum uses from enum.EnumType on some methods. I am still testing other things.

sobolevn avatar Oct 10 '22 16:10 sobolevn

I found only one: when MethodType is defined in a class body:

But where is it used in the real code?

If make this code raising an error instead of adding "from ..." and run pydoc for all stdlib modules, what will fail? Investigate also the history of that code, commit message or issue text can explain its purpose.

serhiy-storchaka avatar Oct 10 '22 16:10 serhiy-storchaka

So, I got back to it. Here's what I found.

test results

Running pydoc tests with raise ValueError instead of note = ' from ' + classname(imclass, mod) showed this:

2 tests failed:
    test_enum test_pydoc

Only pydoc itself and enum are affected. Basically, the same modules I've changed in https://github.com/python/cpython/pull/98120

history

There are two commits that touch this line:

  1. 15 years ago, when unbound methods were removed: https://github.com/python/cpython/commit/ff737954f3ee3005236133fc51b55a508b11aa06
  2. 22 years ago, when https://github.com/python/cpython/commit/b7a48300cd653964c82ae34eae9fab9bebef3578 was merged. It does not have a description. Its title is Fix linking to classes (in class tree, and add links on unbound methods)

So, I can conclude that these lines have something to do with old unbound methods.

= MethodType() usage

There are some usage of it:

» ag '= MethodType' 
Lib/test/test_importlib/import_/test_caching.py
60:        mock.load_module = MethodType(load_module, mock)

Lib/test/test_call.py
754:                meth = MethodType(func, args[0])

But, lets think about whether is this important to show from ... part. Compare these two (with and without from):

>>> import types
>>> class A:
...    x = types.MethodType(some, 1)
... 
>>> help(A)
Help on class A in module __main__:

class A(builtins.object)
 |  Methods defined here:
 |
 |  x = some(*args, **kwargs) from builtins.int
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:

Without:

>>> help(A)
Help on class A in module __main__:

class A(builtins.object)
 |  Methods defined here:
 |
 |  x = some(*args, **kwargs)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
  1. Has anything really changed?
  2. Will users be able to decrypt that x is a MethodType bound to int from this short message? I don't think so. I've spent several hours trying to decode what it means and how it works.

lib usage

Let's take a look at the enum case: it kinda is using this feature.

 |  ----------------------------------------------------------------------
 |  Methods inherited from enum.EnumType:
 |
 |  __contains__(value) from enum.EnumType
 |      Return True if `value` is in `cls`.
 |
 |      `value` is in `cls` if:
 |      1) `value` is a member of `cls`, or
 |      2) `value` is the value of one of the `cls`'s members.
 |
 |  __getitem__(name) from enum.EnumType
 |      Return the member matching `name`.
 |
 |  __iter__() from enum.EnumType
 |      Return members in definition order.
 |
 |  __len__() from enum.EnumType
 |      Return the number of members (no aliases)
 |
 |  ----------------------------------------------------------------------

Is it useful here? Again, I don't think so: we already have Methods inherited from enum.EnumType: part which clearly says where these methods come from.

In my opinion, it brings more confusion than value.

solution

I think that we should at least protect @classmethod from been reported incorrectly, as I did in https://github.com/python/cpython/pull/98120

Or we can completely remove note = ' from ' + classname(imclass, mod) part, because:

  1. It is an artifact from another era: when unbound methods were a thing
  2. It does not bring any value to existing Lib/ code
  3. It is quite cryptic to understand for regular users
  4. It is easy to do, nothing will probably break because of that

@serhiy-storchaka what do you think? :)

sobolevn avatar Nov 07 '22 19:11 sobolevn

I do not see any value in the from some_module.some_type. I'd like to see it removed.

ethanfurman avatar Dec 09 '22 23:12 ethanfurman

Not sure if this is related, but help(enum.Enum) shows:

Help on class Enum in module enum:

class Enum(builtins.object)
 |  . . .

It seems to me that instead of builtins.object that should say enum.EnumType.

ethanfurman avatar Dec 09 '22 23:12 ethanfurman

This issue is much much more complex than it looked at first glance. If you look at the code, you will see other issues with notes for routines. For example an "unbound method" note is never added to unbound methods. It does not work with methods of builtin classes. It seems that this code was written to handle classic classes, and the support of new-style classes was never complete. Some changes was made later to fix runtime errors, but it was not checked that the result is correct and that all cases are supported.

Other issue, is that class members that are bound methods were included in the methods section. But they behave rather like static methods, because they do not accept the self argument.

#113941 fixes these and other issues.

There are still unfixed issues, for example nested classes are not supported (#85799), and unbound methods of builtin classes are not visible as module members (#113942), but this PR is already large.

serhiy-storchaka avatar Jan 11 '24 13:01 serhiy-storchaka

About the enum.Enum case. It was:

  |  Methods inherited from enum.EnumType:
  |
  |  __contains__(value) from enum.EnumType
  |      Return True if `value` is in `cls`.

First, "from enum.EnumType" is redundant, because there is "from enum.EnumType" in the section title. Usually "from" notes are not repeated for every method.

Second, the section title is not completely correct. They are not normal methods. The first argument of normal arguments is self (whatever its name). You cannot call enuminstance.__contains__(). You should call enumclass.__contains__(value) -- this is the behavior of static or class method. It is difficult to distinguish them at that level, so for now they are qualified as static methods.

BTW, for the "from" example look at the enum.StrEnum help. Now it contains:

 |  Methods defined here:
...
 |  __repr__(self) from enum.Enum
 |      Return repr(self).
 |
 |  __str__(self, /) from builtins.str
 |      Return str(self).

Note that __str__ is "from builtins.str", and __repr__ is "from enum.Enum", despite both are "methods defined here". You can see that __str__ is the same as for strings, and __repr__ is implemented in Enum, and that they override methods inherited from base classes.

serhiy-storchaka avatar Feb 11 '24 18:02 serhiy-storchaka