typeshed
Typing the ast.AST subclass constructors
We currently have static types for the fields of most (all?) ast classes, but none of these have typed constructors, e.g.:
https://github.com/python/typeshed/blob/62cde013659b4714b6eefe14cecb4b6752b33570/stdlib/_ast.pyi#L79-L86
I think technically this might be because none of these subclasses actually has its own constructor, but that shouldn't stop us from typing each of the constructors, since in practice the constructor arguments must correspond to the class fields.
Before I do this, though, I'm wondering whether an easy solution would be to apply the dataclass_transform decorator (note: not the same as the dataclass decorator). This would simply tell the type checker that all the class fields can and should be provided in the constructor, which is broadly correct. However, I don't have a deep understanding of how the ast.AST constructor works, so this might not be the correct behaviour. For ClassDef this might look like:
from typing_extensions import dataclass_transform

@dataclass_transform()
class ClassDef(stmt):
    if sys.version_info >= (3, 10):
        __match_args__ = ("name", "bases", "keywords", "body", "decorator_list")
    name: _Identifier
    bases: list[expr]
    keywords: list[keyword]
    body: list[stmt]
    decorator_list: list[expr]
If this isn't sufficient, I propose that we simply add an __init__() stub to each subclass. For instance, for the ClassDef above, this might look like:
class ClassDef(stmt):
    if sys.version_info >= (3, 10):
        __match_args__ = ("name", "bases", "keywords", "body", "decorator_list")
    name: _Identifier
    bases: list[expr]
    keywords: list[keyword]
    body: list[stmt]
    decorator_list: list[expr]
    def __init__(
        self,
        name: _Identifier,
        bases: list[expr],
        keywords: list[keyword],
        body: list[stmt],
        decorator_list: list[expr],
    ) -> None: ...
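For context, here is a minimal runtime sketch of why the proposed __init__ parameters mirror the class fields. It relies only on ast.ClassDef and its _fields attribute from the standard library; the variable names are illustrative, and the exact contents of _fields may vary between Python versions.

```python
import ast

# Keyword arguments use the same names as the annotated fields in the stub.
by_keyword = ast.ClassDef(
    name="Foo",
    bases=[],
    keywords=[],
    body=[ast.Pass()],
    decorator_list=[],
)

# Positional arguments are matched against ClassDef._fields in order.
by_position = ast.ClassDef("Foo", [], [], [ast.Pass()], [])

# _fields lists the names the constructor accepts; newer Python versions
# may include more entries than the five shown in the stub above.
print(ast.ClassDef._fields)
print(by_keyword.name, by_position.name)
```

Since both call styles are accepted at runtime, an __init__ stub with ordinary (positional-or-keyword) parameters in _fields order seems like the closest match.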
I'd be in favour of adding __init__ stubs to each subclass. PR welcome!
Thoughts on the dataclass_transform idea?
I would prefer explicit __init__s, since typeshed typically prefers to mirror closely what happens at runtime, rather than getting fancy and lying. I'm also not sure how well PEP 681 is currently supported by type checkers; last I checked, mypy doesn't yet support it.
(Note dataclass_transform would have to be applied to a base class to work, not to the class itself. You'd also want to be careful to get all the params right, e.g. we shouldn't synthesise comparison methods and stuff)
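As a rough sketch of what "applied to a base class" could look like: NodeBase and ClassDefLike below are hypothetical stand-ins, not the real ast.AST or the typeshed stubs, and the runtime __init__ is only an approximation so that the example runs. How a given type checker treats this is an assumption that would need checking.

```python
from typing import ClassVar

try:
    from typing import dataclass_transform  # Python 3.11+
except ImportError:
    from typing_extensions import dataclass_transform


# eq_default=False tells type checkers not to assume a synthesized __eq__
# unless a subclass explicitly opts in.
@dataclass_transform(eq_default=False)
class NodeBase:
    """Hypothetical stand-in for ast.AST; not the real implementation."""

    # ClassVar keeps _fields out of the synthesized constructor signature.
    _fields: ClassVar[tuple[str, ...]] = ()

    def __init__(self, *args, **kwargs):
        # Approximate the runtime behaviour: positional arguments are
        # assigned to field names in order, keyword arguments by name.
        for field_name, value in zip(self._fields, args):
            setattr(self, field_name, value)
        for field_name, value in kwargs.items():
            setattr(self, field_name, value)


class ClassDefLike(NodeBase):
    _fields = ("name", "bases")
    name: str
    bases: list


node = ClassDefLike("Foo", bases=["object"])
print(node.name, node.bases)
```

With the decorator on the base class, a PEP 681-aware checker should treat the annotated attributes of each subclass as constructor parameters, which is the effect the original proposal was after.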
Invoking the constructors seems fragile to me. For example, if you instantiate ast.Module, it's easy to forget to pass in a second argument and end up with a Module that doesn't have a type_ignores attribute:
>>> ast.dump(ast.parse('f()'))
"Module(body=[Expr(value=Call(func=Name(id='f', ctx=Load()), args=[], keywords=[]))], type_ignores=[])"
>>> ast.parse('f()').type_ignores
[]
>>> ast.Module(body=[]).type_ignores
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Module' object has no attribute 'type_ignores'
You can even make an empty module that doesn't have any attributes:
>>> ast.dump(ast.Module())
'Module()'
>>> ast.dump(ast.Module([]))
'Module(body=[])'
>>> ast.dump(ast.Module([], []))
'Module(body=[], type_ignores=[])'
So code that instantiates AST nodes will break every time a new attribute is added. Maybe that means you shouldn't instantiate nodes directly and the type checker should report an error if you do, or maybe it just means that type checking is unusually important here.
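One way to catch this at runtime, sketched under the assumption that every name in _fields should be set on a well-formed node: missing_fields is a hypothetical helper, not part of the ast module, and what it reports for hand-built nodes can depend on the Python version.

```python
import ast


def missing_fields(node: ast.AST) -> list[str]:
    """Return the names in node._fields that are not set on this instance."""
    return [name for name in node._fields if not hasattr(node, name)]


parsed = ast.parse("f()")
print(missing_fields(parsed))  # parsed trees have every field populated

hand_built = ast.Module(body=[])
print(missing_fields(hand_built))
```

A check like this only reports missing attributes; it does not validate their types, which is where typed constructors would help.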
@Akuli I interpret it as type checking being particularly important for this module, which is why I raised this issue. imo ideally the actual constructor would validate the arguments too, but it doesn't seem to do that, and I don't want to touch the C code that provides the constructor, so I'm instead aiming at a type annotated constructor.