cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

Support ForwardRef

Open aha79 opened this issue 3 years ago • 7 comments

This adds ForwardRef support to cattr (See #201).

  • ForwardRefs in classes

    @dataclass
    class Test:
        x: List["XType"]
    
    XType = int
    
  • Recursive classes

    @dataclass
    class Node:
        data: Dict
        left: Optional["Node"]
        right: Optional["Node"]
    
  • Recursive type aliases. (Note: Even with PEP 563 we need ForwardRefs here)

    # Tree-like structure
    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    
    class UserClass:
        some_tree_data: Node
    
  • Recursive type aliases, across modules. (Needs explicit ForwardRef resolution, though. This is a Python limitation)

    # module.py
    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    cattr.resolve_types(Node, globals(), locals())
    
    # main.py
    from module import Node
    class UserClass:
        some_tree_data: Node
    
  • Manual registration (this worked before, but is still supported.) This PR also fixes #206)

    Node = Tuple[ Dict, Optional["Node"], Optional["Node"] ]
    
    c.register_structure_hook(
        ForwardRef("Node"), lambda obj, _: c.structure(obj, Node)
    )
    c.register_unstructure_hook(
        ForwardRef("Node"), lambda obj: c.unstructure(obj, Node)
    )
    

Specifically, the PR does

  • Add hooks for ForwardRef (this equires ForwardRefs to be in resolved state, otherwise an error is generated)

  • Ensure that ForwardRef are resolved for registered types, whenever it is possible to resolve them without context information. (Essentially this is done by a call get_type_hints on the type containing the ForwardRef).

    For this a new function resolve_types was added which is systematically used. It basically does the same thing as attrs.resolve_types but works for all types (attrs classes, dataclasses and type aliases).

  • ForwardRefs are now registered with direct dispatch and no longer singledispatch. This is to work around #206.

aha79 avatar Jan 27 '22 12:01 aha79

Thanks for putting in work on this.

The hooks are obviously fine. The issue is that this introduces resolve_types for dataclasses, and tbh that's out of scope for cattrs. That should be implemented in Python itself or another library that's willing to take on the maintenance burden of it.

Tinche avatar Jan 27 '22 16:01 Tinche

Hmm, but what is the alternative? Not supporting dataclasses would not be very useful - at least not for me. Hooks without any type resolving before will just generate an error message.

In fact, you already do resolve dataclasses in cattr. It is the get_type_hints call inside adapted_fields. Even though this is enough to handle PEP563, one needs more symmetric handling of attrs and dataclasses to get all ForwardRef tests to pass. That is why I introduced resolve_types, in the same spirit as has and fieldsin _compat.py.

One could simplify the new resolve_types by removing the part under if not is_py39_plus:, though. This would not really hurt as it only affects older Python versions (and only if PEP 563 is enabled).

aha79 avatar Jan 31 '22 12:01 aha79

I understand your concern regarding mantainability. However I would argue having a more symmetric way of handling dataclasses and attrs classes is cleaner and needs less maintenance effort.

For instance: One can get rid of get_type_hints call in adapted_fields (which probably should not really be in there) and the calls to attrs.resolve_types scattered around cattrs (and some more clean ups). But that would be a refactoring on its own and I wanted to keep the list of changes for this PR small.

aha79 avatar Jan 31 '22 12:01 aha79

Could it also be that in https://github.com/python-attrs/cattrs/commit/a4282724a47df8ac9f5a9c8069dc49160cae152d cattr/gen.py and cattrs/gen.py are accentially swapped?

aha79 avatar Jan 31 '22 12:01 aha79

You make a couple of good points, so let's discuss further.

(The cattr -> cattrs move isn't accidental, it's on purpose. I think all new code will be going into the cattrs namespace, and existing code will be migrating there over time. The old packages will still have aliases so no backwards compatibility break.)

The resolve_types as you've implemented it overwrites the type field for dataclasses. I have no idea if this is a supported use case, and that's why cattrs is careful to never do it. Hence the adapted fields approach. If in the next version Python decides to make this impossible by making this field immutable or something, I don't want this to be a cattrs problem.

On a higher level, here's the philosophy of cattrs: attrs is simply a better library than dataclasses, and one of its advantages is that the maintainers of attrs (I'm one of them) choose to implement and maintain resolve_types. The maintainers of dataclasses choose not to - instead the interface provided is get_type_hints, and the users get to deal with that. The answer here isn't cattrs patching up this missing support in dataclasses, it's the users of dataclasses (you) switching to attrs or putting pressure on the maintainers of dataclasses to implement this.

Now that you see where I'm coming from, you say you're forced to use dataclasses and I'm sympathetic to this use case. So let's discuss to see what we can accomplish without cattrs implementing resolve_types for dataclasses in this form.

One of the options is you implementing resolve_types for dataclasses in your code base. So you get to maintain it instead ;) Would this work for you? Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.

Also, didn't you mention ForwardRefs can resolve themselves in newer Python versions? Presumably via get_typing_hints?

Tinche avatar Jan 31 '22 18:01 Tinche

The primary reason to implement resolve_types for all types is that it is needed inside cattr for ForwardRef resolution and not to provide it to the user (that would be just a side effect).

Perhaps let me first expand on what functionality is needed:

  • You are right that Python (only) provides the get_type_hints function for type resolution. In fact this is all we need. The thing that makes it complicated is that we cannot just call get_type_hints on the ForwardRef and expect it to work.

  • Python/get_type_hints, under the hood, does some complicated things to find out the target module of the ForwardRef. Some types know their module and get_type_hints will resolve the ForwardRefs in child annotations. For instance get_type_hints called on a dataclass with a field of type Optional['X'] will resolve the 'X' in the same module scope as the dataclass was defined, no matter where the call to get_type_hints is made.

  • Types that know their own module are: ForwardRefs with module field set (only used in special cases), classes (including attrs classes & dataclasses), NewType, TypedDict, NamedTuple, and perhaps more I overlooked [1].

  • So to resolve ForwardRefs we need to call get_type_hints on some parent type of the ForwardRef. When this parent type knows its module get_type_hints will resolve the child ForwardRefs.

  • Concluding, since it is beyond the scope of a library such as cattr to know which typing.py types carry module scope and which not, the simplest solution is to just call get_type_hints on every type that is supposed to get (un)structured. That makes sure that all child-ForwardRefs get resolved when there is opportunity for them to get resolved.

The original idea was to concentrate all get_type_hints business logic calls into one function which is called in the hook factory and takes care of resolving ForwardRefs and (the closely related) PEP563 delayed annotations.

So now let me try to address your specific concerns: I see several options:

  1. Remove overwriting of type fields of dataclasses in resolve_types. You make a valid point of not overwriting the type field of dataclass Field. This is not good. Removing it will just introduce failing tests with Converter. But GenConverter is still fine. If ForwardRefs would only be supported with GenConverter this is ok for me.
  2. Make resolve_types return a value. After get_type_hints do not write to type field, but convert dataclasses to attrs classes. This is essentially the functionality that is done in adapted-fields, and not only return a list of Attribute but put these in a attrs class and return that. (This and corresponding clean-ups could also be done after 1 and we could would not strictly need that)
  3. The resolve_types does not need to be exported from cattr.
  4. resolve_types could receive a better name

So if we do 1 & 3 and possibly 4, the resolve_types would have the dataclasses handling removed and not be exported. It would not generate an error when called for datacasses and work for type aliases as well.

[1]: Even though these types are not (yet) supported by cattrs, the user might still register a hook for them and expect child ForwardRefs to work.

aha79 avatar Feb 10 '22 21:02 aha79

Another option would be improving some of the cattrs APIs to be more customizable, which would be useful for other things too.

Could you elaborate on that?

aha79 avatar Feb 10 '22 22:02 aha79