cattrs
cattrs copied to clipboard
Support ForwardRef
This adds ForwardRef support to cattr (See #201).
-
ForwardRefs in classes
@dataclass class Test: x: List["XType"] XType = int
-
Recursive classes
@dataclass class Node: data: Dict left: Optional["Node"] right: Optional["Node"]
-
Recursive type aliases. (Note: Even with PEP 563 we need ForwardRefs here)
# Tree-like structure Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ] class UserClass: some_tree_data: Node
-
Recursive type aliases, across modules. (Needs explicit ForwardRef resolution, though. This is a Python limitation)
# module.py Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ] cattr.resolve_types(Node, globals(), locals()) # main.py from module import Node class UserClass: some_tree_data: Node
-
Manual registration (this worked before, but is still supported.) This PR also fixes #206)
Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ] c.register_structure_hook( ForwardRef("Node"), lambda obj, _: c.structure(obj, Node) ) c.register_unstructure_hook( ForwardRef("Node"), lambda obj: c.unstructure(obj, Node) )
Specifically, the PR does
-
Add hooks for ForwardRef (this equires ForwardRefs to be in resolved state, otherwise an error is generated)
-
Ensure that ForwardRef are resolved for registered types, whenever it is possible to resolve them without context information. (Essentially this is done by a call
get_type_hints
on the type containing the ForwardRef).For this a new function
resolve_types
was added which is systematically used. It basically does the same thing asattrs.resolve_types
but works for all types (attrs classes, dataclasses and type aliases). -
ForwardRefs are now registered with direct dispatch and no longer singledispatch. This is to work around #206.
Thanks for putting in work on this.
The hooks are obviously fine. The issue is that this introduces resolve_types
for dataclasses, and tbh that's out of scope for cattrs. That should be implemented in Python itself or another library that's willing to take on the maintenance burden of it.
Hmm, but what is the alternative? Not supporting dataclasses would not be very useful - at least not for me. Hooks without any type resolving before will just generate an error message.
In fact, you already do resolve dataclasses in cattr. It is the get_type_hints
call inside adapted_fields
. Even though this is enough to handle PEP563, one needs more symmetric handling of attrs and dataclasses to get all ForwardRef tests to pass. That is why I introduced resolve_types
, in the same spirit as has
and fields
in _compat.py.
One could simplify the new resolve_types
by removing the part under if not is_py39_plus:
, though. This would not really hurt as it only affects older Python versions (and only if PEP 563 is enabled).
I understand your concern regarding mantainability. However I would argue having a more symmetric way of handling dataclasses and attrs classes is cleaner and needs less maintenance effort.
For instance: One can get rid of get_type_hints
call in adapted_fields
(which probably should not really be in there) and the calls to attrs.resolve_types
scattered around cattrs (and some more clean ups).
But that would be a refactoring on its own and I wanted to keep the list of changes for this PR small.
Could it also be that in https://github.com/python-attrs/cattrs/commit/a4282724a47df8ac9f5a9c8069dc49160cae152d cattr/gen.py and cattrs/gen.py are accentially swapped?
You make a couple of good points, so let's discuss further.
(The cattr -> cattrs move isn't accidental, it's on purpose. I think all new code will be going into the cattrs namespace, and existing code will be migrating there over time. The old packages will still have aliases so no backwards compatibility break.)
The resolve_types
as you've implemented it overwrites the type
field for dataclasses. I have no idea if this is a supported use case, and that's why cattrs is careful to never do it. Hence the adapted fields approach. If in the next version Python decides to make this impossible by making this field immutable or something, I don't want this to be a cattrs problem.
On a higher level, here's the philosophy of cattrs: attrs is simply a better library than dataclasses, and one of its advantages is that the maintainers of attrs (I'm one of them) choose to implement and maintain resolve_types
. The maintainers of dataclasses choose not to - instead the interface provided is get_type_hints
, and the users get to deal with that. The answer here isn't cattrs patching up this missing support in dataclasses, it's the users of dataclasses (you) switching to attrs or putting pressure on the maintainers of dataclasses to implement this.
Now that you see where I'm coming from, you say you're forced to use dataclasses and I'm sympathetic to this use case. So let's discuss to see what we can accomplish without cattrs implementing resolve_types
for dataclasses in this form.
One of the options is you implementing resolve_types
for dataclasses in your code base. So you get to maintain it instead ;) Would this work for you? Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.
Also, didn't you mention ForwardRefs can resolve themselves in newer Python versions? Presumably via get_typing_hints
?
The primary reason to implement resolve_types
for all types is that it is needed inside cattr for ForwardRef
resolution and not to provide it to the user (that would be just a side effect).
Perhaps let me first expand on what functionality is needed:
-
You are right that Python (only) provides the
get_type_hints
function for type resolution. In fact this is all we need. The thing that makes it complicated is that we cannot just callget_type_hints
on the ForwardRef and expect it to work. -
Python/
get_type_hints
, under the hood, does some complicated things to find out the target module of the ForwardRef. Some types know their module andget_type_hints
will resolve the ForwardRefs in child annotations. For instanceget_type_hints
called on a dataclass with a field of typeOptional['X']
will resolve the 'X' in the same module scope as the dataclass was defined, no matter where the call toget_type_hints
is made. -
Types that know their own module are:
ForwardRefs
withmodule
field set (only used in special cases), classes (including attrs classes & dataclasses), NewType, TypedDict, NamedTuple, and perhaps more I overlooked [1]. -
So to resolve ForwardRefs we need to call
get_type_hints
on some parent type of the ForwardRef. When this parent type knows its moduleget_type_hints
will resolve the child ForwardRefs. -
Concluding, since it is beyond the scope of a library such as cattr to know which typing.py types carry module scope and which not, the simplest solution is to just call
get_type_hints
on every type that is supposed to get (un)structured. That makes sure that all child-ForwardRefs get resolved when there is opportunity for them to get resolved.
The original idea was to concentrate all get_type_hints
business logic calls into one function which is called in the hook factory and takes care of resolving ForwardRefs and (the closely related) PEP563 delayed annotations.
So now let me try to address your specific concerns: I see several options:
- Remove overwriting of
type
fields of dataclasses inresolve_types
. You make a valid point of not overwriting thetype
field of dataclassField
. This is not good. Removing it will just introduce failing tests withConverter
. ButGenConverter
is still fine. If ForwardRefs would only be supported with GenConverter this is ok for me. - Make
resolve_types
return a value. Afterget_type_hints
do not write totype
field, but convert dataclasses to attrs classes. This is essentially the functionality that is done inadapted-fields
, and not only return a list of Attribute but put these in a attrs class and return that. (This and corresponding clean-ups could also be done after 1 and we could would not strictly need that) - The
resolve_types
does not need to be exported from cattr. -
resolve_types
could receive a better name
So if we do 1 & 3 and possibly 4, the resolve_types would have the dataclasses handling removed and not be exported. It would not generate an error when called for datacasses and work for type aliases as well.
[1]: Even though these types are not (yet) supported by cattrs, the user might still register a hook for them and expect child ForwardRefs to work.
Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.
Could you elaborate on that?