cattrs
cattrs copied to clipboard
structuring with `include_subclasses` does not work when the union type is a direct field of a dataclass.
- cattrs version: 23.1.2
- Python version: 3.10.7
- Operating System: Windows
Description
structuring with include_subclasses
does not work properly when the union type is a direct field of a dataclass. When it is indirectly used in the type hint of a field it works.
What I Did
@dataclass
class A:
x: int
@dataclass
class Derived(A):
d: A
def ustrat(a: Any, conv) -> Any:
return cattrs.strategies.configure_tagged_union(a, conv, tag_name="type")
converter = cattrs.Converter()
cattrs.strategies.include_subclasses(A, converter, union_strategy=ustrat)
dic = [{'x': 9, 'd': {'x': 99, 'd': {'x': 999, 'type': 'A'}, 'type': 'Derived'}, 'type': 'Derived'}]
data_out = converter.structure(dic, List[A] )
# >> data_out = [Derived(x=9, d=A(x=99))] # WRONG we expect '[Derived(x=9, d=Derived(x=99, d=A(x=999)))]'
# Note: The reverse (i.e. unstructuring) works as expected.
Now if we (incorrectly) change A
to Optional[A]
in Derived
it does indeed work. Using Optional[A] instead of A does not harm serialization, as None does never occur. It interferes with static type checking though so it is not a good workaround.
@dataclass
class Derived(A):
d: Optional[A] # correct would be 'A', this is a work around
data_out = converter.structure(dic, List[A] )
# >> data_out = [Derived(x=9, d=Derived(x=99, d=A(x=999)))] # RIGHT (but at the cost of wrong Derived.d)
Interesting. So what I think is happening is the hook for Derived
gets created before the include_subclasses
hook for A
gets installed, hence your behavior.
It's somewhat of a chicken and egg problem. The include_subclasses
hook for A needs the hook for Derived
to exist, and the hook for Derived
needs the include_subclasses
hook for A to exist.
I wonder if we can be clever and override hooks for children if their type is the parent type. So the hook for Derived.d
would use late binding, i.e. be converter.structure(obj, A)
. (Note that cattrs doesn't do this by default for performance.)
@Tinche I'm not sure if this is the same problem, but it seems to be an issue with the tagged union. It works fine if I don't use that.
Example:
>>> import functools
>>> import attrs
>>> import cattrs
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
... pass
...
>>> @attrs.define()
... class Child(Base):
... field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c)
>>> result = c.structure({"field": {"field": {}}}, Base)
>>> result
Child(field=Child(field=Base()))
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.unstructure(Child(field=Child(field=Base())))
{'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base()) # Child is lost!
I started using cattrs and suddenly one class didn't work. It seems the "workaround" making fields optional makes it work, but is not a good solution in the long run.
What I had which worked:
@attrs.define
class Base:
pass
@attrs.define
class List(Base):
items: List[Base]
@attrs.define
class Child(Base):
left: Base
right: Base
List with Child items works fine, but in Child I have to change type of left
and right
to Optional[Base] in order to have them structured as subclasses to Base.
An example with the List-class:
>>> c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
{'left': {'items': [{'exp_type': 'Base'}, {'left': {'items': [{'exp_type': 'Base'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}
>>> d = c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
>>> c.structure(d, Base)
Child(left=Base(), right=Base())
Subclasses without tagged union doesn't work for List because it complains about not having any unique fields, which I find strange as well since it clearly has the unique field items: List[Base]
, but that's another problem.
@aha79 @Tinche Here's my ugly workaround which let's me skip the workaround using Optional using the example in my previous comment.
>>> import functools
>>> from typing import Dict, Type
>>> import attrs
>>> import cattrs
>>> import cattrs.gen
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
... tag_to_cls: Dict[str, Type] = {}
... def __init_subclass__(cls, **kwargs):
... super().__init_subclass__(**kwargs)
... cls.tag_to_cls.update({cls.__name__: cls})
...
>>> @attrs.define()
... class Child(Base):
... field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base()) # same as in my previous comment, child is lost
>>> def structure_hook(c, o, cl): # cl = Base
... cl = Base.tag_to_cls.get(o["exp_type"], cl) # default to Base
... struct = cattrs.gen.make_dict_structure_fn(cl, c)
... return struct(o, cl)
>>> c.register_structure_hook(Base, functools.partial(structure_hook, c))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Child(field=Base()))
@anderssonjohan I spent some time debugging your example (phew, these are kind of hard to debug tbh).
I think it's the same issue. The reason it works if you change the field type to an Optional
is because that forces late resolution for the Child.field
hook, which is what solves the underlying issue. Definitely need to figure this out for the release after next (the next release is very close and very overdue, so I don't want to bloat it).
Looks like this was fixed by a different issue. I've added a unit test for the OP case just the same.