cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

ForwardRef and GenConverter do not seem to work together

Open aha79 opened this issue 2 years ago • 14 comments

  • cattrs version: 1.8.0 (attrs 21.2.0)
  • Python version: 3.9.7
  • Operating System: Windows 10

Description

When using recursive types (see example below), structure with GenConverter raises a StructureHandlerNotFoundError with the message "Unsupported type: ForwardRef('Y'). Register a structure hook for it" . The error does not happen with PEP 563 enabled. That means when we add 'from future import annotations' to the program and replace y: typing.Optional['Y'] with y: typing.Optional[Y] everything works as expected.

What I Did

import typing
import cattr
import attr

@attr.define(auto_attribs=True)
class X:
    a: int
    b: float
    y: typing.Optional['Y']   #with PEP563 enabled:   y: typing.Optional[Y]   (that does not raise an exception) 

@attr.define(auto_attribs=True)
class Y:
    a: typing.List[int]
    x: X


converter = cattr.GenConverter()

data = { 'a': [1,2,3], 'x': {'a':1, 'b':2.000001, 'y': {'a': [], 'x': {'a': 5, 'b': 3.1, 'y': None}}} }
d = converter.structure( data, Y )  # <---- raises StructureHandlerNotFoundError    Unsupported type: ForwardRef('Y'). Register a structure hook for it

aha79 avatar Dec 21 '21 21:12 aha79

Just FYI you don't need auto_attribs=True when using define, that's the default behavior :)

Tinche avatar Dec 21 '21 22:12 Tinche

I was bitten by this lately too. Wrap the entire type in quotes, instead of just 'Y':

import typing

import attr

import cattr


@attr.define
class X:
    a: int
    b: float
    y: "typing.Optional[Y]"


@attr.define
class Y:
    a: typing.List[int]
    x: X


converter = cattr.GenConverter()

data = {
    "a": [1, 2, 3],
    "x": {
        "a": 1,
        "b": 2.000001,
        "y": {"a": [], "x": {"a": 5, "b": 3.1, "y": None}},
    },
}
d = converter.structure(data, Y)

It has to do with how typing.get_type_hints works.

Tinche avatar Dec 21 '21 22:12 Tinche

Okay, I see. I confirm, this works.

However now I try to get this one-line recursive type

YY = typing.List[ typing.Tuple[typing.Optional[YY],int] ]

to work.

Unfortunately, this generates a "NameError" even if PEP563 is enabled (I know this is not a cattrs issue. perhaps this is not supported by PEP563).

But with PEP563 disabled, putting quotes around it completely

YY = "typing.List[ typing.Tuple[typing.Optional[YY],int] ]" 

gives a StructureHandlerNotFoundError error in structure.

data2 = [ (None, 1), (None, 2), ([(None, 11), ([],22)], 3) ]  
d2 = converter.structure( data2, YY )   # <--- raises StructureHandlerNotFoundError 

Also putting the quote somewhere in between, or around "YY" does not avoid the exception.

YY = typing.List[ typing.Tuple["typing.Optional[YY]",int] ]

Some background: I am writing a code generator which generates type-hinted code automatically. So I need a strategy to generate the type-hints which always works.

I am not really sure what the correct way to specify type "YY" is, so I cannot really tell if there is a cattr issue here or not.

However, https://stackoverflow.com/a/53845083/1739884 seems to indicate that

YY = typing.List[ typing.Tuple[typing.Optional["YY"],int] ]

should work.

aha79 avatar Dec 21 '21 22:12 aha79

That's an interesting problem.

I don't think PEP 563 is in play here directly, that PEP deals with annotations (think a type of a class field or a function argument), you're dealing with raw types. For example, if you call typing.get_type_hints on the string "typing.List[ typing.Tuple[typing.Optional[YY],int] ]" it'll fail because it doesn't know what to do with it. If this annotation was on a field in a class, it'd be easier to handle.

The typing.List[ typing.Tuple[typing.Optional["YY"],int] ] might work for static type analyzers, but cattrs cannot turn the string "YY" into the YY type - there's not enough information at runtime.

I think in order to support this use case cattrs would need to be able to handle ForwardRefs natively. Then you'd write:

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]

I'm not exactly sure if it's possible though. In order to resolve the forward reference, we'd need to get a reference to the globals() dictionary of the type somehow.

Note that you can hack it yourself with:

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]
YY.__args__[0].__args__[0].__args__ = (YY, type(None))

but you'll essentially create a recursive loop, and you won't be able to even print out the type. Also cattrs won't be able to handle it since it can't hash it, due to the circular reference.

Tinche avatar Dec 22 '21 00:12 Tinche

My preliminary conclusions after I thought about this issue:

  • Even with PEP 563 enabled there are situations where ForwardRefs (or quoted types) are needed to correctly describe a situation. The example is YY = typing.List[typing.Tuple[typing.Optional["YY"], int]] or the more famous

    Json = typing.Union[ typing.List['Json'], typing.Dict[str, 'Json'], int, float, bool, str, None ]

    Since this is a type alias PEP 563 does not apply here.

  • I learned the type resolution of ForwardRefs will only work within the same module (that is probably what you mean with that there is not enough information for catts to convert 'YY' to YY). This is a little disappointing Python limitation and makes a few things impossible. (See also the note on typing.get_type_hints). .

After thinking about this, probably the best solution is to take the cattr suggestion in the exception literally and to register a custom hook for each ForwardRef that is used. The Example looks like

YY = typing.List[ typing.Tuple[typing.Optional['YY'],int] ]

converter.register_structure_hook(typing.ForwardRef('YY'), lambda inst, cls: converter.structure(inst, YY) )

data2 = [ (None, 1), (None, 2), ([(None, 11), ([],22)], 3) ]
d2 = converter.structure( data2, YY )   # <<-- now works!
print(d2) 
# --> [(None, 1), (None, 2), ([(None, 11), ([], 22)], 3)]

Observations:

  • This is a general strategy that always works when ForwardRefs are used (i.e. quotes in type-hints)

  • One needs to do this registration after the type the ForwardRef refers to is defined (and in the same module).

  • The hook calls converter.structure, so that there are no issues with recursion (as there would be with make_*_fn )

  • It is probably impossible to do this inside of cattrs (i.e. handling of ForwardRef's), due to a Python limitation.

    (PEP 563 would be of no help here as it cannot eliminate the need for ForwardRefs in all cases, e.g. in the Json definition)

Do you agree to this?

aha79 avatar Dec 22 '21 22:12 aha79

I'm looking at the source code of ForwardRef in Python 3.9.7, and assuming it's essentially the same in all other versions of Python we support.

I see ForwardRefs can be evaluated manually before use by calling ForwardRef._evaluate(globals(), locals()) on it. Since this evaluation requires the globals and locals it can't really be done by cattrs. But if we say you need to evaluate your forward references yourself before cattrs sees them, maybe we can have a generic hook for handling ForwardRefs inside cattrs. Then you wouldn't need to register a hook for every ForwardRef, you'd just need to evaluate all your ForwardRefs.

Tinche avatar Dec 27 '21 17:12 Tinche

I am not sure if we could really evaluate the ForwardRefs, as it would probably trigger the hashing issue with recursive types. So I guess we need to keep the ForwardRefs unevaluated.

Actually we have two issues, one which I did not think about when I wrote my last comment. There may be two types of same name (say YY) in two modules. Now depending in which module the ForwardRef is created, it refers to different types. This is clear, but when we cannot evaluate the Refs we need to keep the text and find another way to resolve the ambiguity. And registering hooks for the ForwardRef may generate clashes, as same object may semantically refer to different types. One solution could be to change the ForwardRef to include module Information. So not 'YY' but 'package.module.YY'. Then ForwardRefs could be resolved automatically inside cattrs, perhaps inside a generator factory hook. Also ForwardRefs would also be unique and there would not be any clashes wihen registering hooks for ForwardRefs.

I also have an idea how to automatically add the module Information to forwardrefs but that I need to try out first. I will report in the next days.

aha79 avatar Dec 28 '21 12:12 aha79

@aha79 I think we can evaluate ForwardRefs without causing the recursive issue. An example:

import typing

from cattr import GenConverter

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]
YY.__args__[0].__args__[0].__args__[0]._evaluate(globals(), locals(), set())

c = GenConverter()
c.register_structure_hook_func(
    lambda t: t.__class__ is typing.ForwardRef,
    lambda v, t: c.structure(v, t.__forward_value__),
)

print(c.structure([([(None, 1)], 0)], YY))
[([(None, 1)], 0)]

Tinche avatar Dec 28 '21 14:12 Tinche

@Tinche I think this is a workable solution.

Sidenote: You do not need to explicitly construct ForwardRef, this is done automatically :-). Just write Optional['YY']

However if we accept explicitly calling ForwardRef, we could also pass the module parameter (e.g. ForwardRef('YY', module=__name__)) .

Then we can pass None to the globals and locals and it still works.

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY", module=__name__)], int]] 
YY.__args__[0].__args__[0].__args__[0]._evaluate(None, None, set())   # <- this could be done inside of cattr

Evaluation could be done inside of cattr. No explicit evaluation by user is necessary, just the module parameter must be given!

However, there two minor drawbacks. These are:

  • (probably a Python bug) Python has the issue that it does not respect the module parameter in the caching mechanism in the typing module.

     ZZ = typing.Optional['YY']
     class Test:
         zz: ZZ
     YY = int
    
     get_type_hints(Test)    # this creates a cache entry of Optional[ForwardRef('YY')], i.e. without module parameter
    
     YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY", module=__name__)], int]]    
    
     print( YY.__args__[0].__args__[0].__args__[0].__forward_module__ )  
     #  ->  None   (correct would be the value of __name__)
    

    I created a Python bug report
    This is especially an issue if in different modules ForwardRefs with the same name are used.

  • mypy does not seem to like the module parameter of ForwardRefs

Nevertheless I think it would be a great thing to support ForwardRefs in catts. First, those which are evaluated (in the manner you described), and, second, those which have the forward__module parameter set. Not sure how this could be done with _structuring_hook, especially since the evaluation is only done once. What do you think?

aha79 avatar Jan 10 '22 13:01 aha79

This seems to do the job

c = cattr.GenConverter()

def _gen_forwardref_hook(type):
    type._evaluate(None, None, set())
    return lambda v, t: c.structure(v, t.__forward_value__)

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.ForwardRef,
    _gen_forwardref_hook )

The following tests assume the previous code is in different module module (so that we get NameErrors when we should).

  • No context given and no ForwardRef evaluation: raises NameError (this is expected)

    c = module.c
    
    ZZ = List[Tuple[Optional["ZZ"], int]]
    print(c.structure([([(None, 1)], 0)], ZZ))   
    # -> NameError('Name "ZZ" is not defined')
    
  • Explicit ForwardRef and module parameter: Works in 3.9+ (before 3.9 there was no module parameter)

    c = module.c
    
    ZZ = List[Tuple[Optional[ForwardRef("ZZ", module=__name__)], int]]
    print(c.structure([([(None, 1)], 0)], ZZ))
    # -> [([(None, 1)], 0)]
    
  • Explicit ForwardRef and module parameter mixed with plain ForwardRefs. Fails (this is unexpected, probably a Python bug)

    c = module.c
    dummy = Optional['ZZ']
    
    ZZ = List[Tuple[Optional[ForwardRef("ZZ", module=__name__)], int]]
    print(c.structure([([(None, 1)], 0)], ZZ))
    # -> NameError('Name "ZZ" is not defined')
    
  • Explicit evaluation of type alias: Works

    c = module.c
    
    ZZ = List[Tuple[Optional["ZZ"], int]]
    typing._eval_type(ZZ, globals(), locals())
    
    print(c.structure([([(None, 1)], 0)], ZZ)) 
    # -> [([(None, 1)], 0)]
    

The same results are obtained if ZZ is used as a dataclass member.

aha79 avatar Jan 10 '22 20:01 aha79

I just found a better way:

In Python 3.10 +, cattrs could evaluate ForwardRefs without user intervention.

We would just need to require that the user must use NewType (which probably is a reasonable thing for type aliases).

The Hooks now look like this

# module.py
c = cattr.GenConverter()
def _gen_forwardref_hook(type):
    try:
        type._evaluate(None, None, set())
    except NameError as e:
        raise ValueError(f"ForwardRef({type.__forward_arg__!r}) cannot be resolved.")
    if not type.__forward_evaluated__:
        raise ValueError(f"ForwardRef({type.__forward_arg__!r}) is not resolved.")
    return lambda v, t: c.structure(v, t.__forward_value__)

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.ForwardRef,
    _gen_forwardref_hook )

def _gen_newtype_hook(type):
    if type.__module__ != 'typing': # pre 3.10 did not set the __module__ properly
        globalns = sys.modules[type.__module__].__dict__
        typing._eval_type(type.__supertype__, globalns, None)
    return lambda v, t: c.structure(v, t.__supertype__)
    #return c._structure_func.dispatch(type.__supertype__)  <-- does not work

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.NewType,
    _gen_newtype_hook )

With Python 3.10+ everything works out-of-the-box! (this also works when YY is imported from different module!)

c = module.c

@dataclass
class X:
    a: int

YY = NewType('YY', List[ Tuple[ Optional[ 'YY'], X] ] )

print(c.structure([([(None, {'a':2})], {'a':1})], YY) )
# -> [([(None, X(a=2))], X(a=1))]

The nice thing of NewType is that in 3.10+ the __module__ member contains the module from where NewType was called. Previously (<= 3.9, it contained typing, so is of no help)

Nevertheless:

Python 3.9+: one can use the module parameter

YY = NewType('YY', List[ Tuple[ Optional[ ForwardRef('YY', module=__name__)], X] ] )

Python <3.9: explicit evaluation

YY = NewType('YY', List[ Tuple[ Optional[ ForwardRef('YY', module=__name__)], X] ] )
typing._eval_type(YY.__supertype__, globals(), locals()) 

aha79 avatar Jan 12 '22 23:01 aha79

Hm, interesting.

We don't support NewTypes currently (there's an open issue for that), so I don't know if I'm OK with having a NewType hook as part of this thing. I think NewTypes are potentially more important than ForwardRefs, so I cannot comment on this approach until I've looked at NewTypes in depth, which I don't have time to do now.

The first approach seemed simpler.

Tinche avatar Jan 12 '22 23:01 Tinche

@Tinche Allright. Then lets split the issue, and put the NewType handling aside. It can be added later, and, as a side-effect, improve the usability for ForwardRefs.

Should I prepare a PR for the ForwardRef handling?

aha79 avatar Jan 13 '22 19:01 aha79

Yep, that sounds like a good start.

Tinche avatar Jan 14 '22 00:01 Tinche