cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

structuring with `include_subclasses` does not work when the union type is a direct field of a dataclass.

Open aha79 opened this issue 1 year ago • 4 comments

  • cattrs version: 23.1.2
  • Python version: 3.10.7
  • Operating System: Windows

Description

structuring with include_subclasses does not work properly when the union type is a direct field of a dataclass. When it is indirectly used in the type hint of a field it works.

What I Did

@dataclass
class A:
    x: int

@dataclass
class Derived(A):
    d: A

def ustrat(a: Any, conv) -> Any:
    return cattrs.strategies.configure_tagged_union(a, conv, tag_name="type")

converter = cattrs.Converter()
cattrs.strategies.include_subclasses(A, converter, union_strategy=ustrat)

dic = [{'x': 9, 'd': {'x': 99, 'd': {'x': 999, 'type': 'A'}, 'type': 'Derived'}, 'type': 'Derived'}]

data_out = converter.structure(dic, List[A] )

# >> data_out  = [Derived(x=9, d=A(x=99))]       # WRONG we expect  '[Derived(x=9, d=Derived(x=99, d=A(x=999)))]'
# Note: The reverse (i.e. unstructuring) works as expected.

Now if we (incorrectly) change A to Optional[A] in Derived it does indeed work. Using Optional[A] instead of A does not harm serialization, as None does never occur. It interferes with static type checking though so it is not a good workaround.

@dataclass
class Derived(A):
    d: Optional[A]      # correct would be 'A', this is a work around

data_out = converter.structure(dic, List[A] )

# >> data_out  = [Derived(x=9, d=Derived(x=99, d=A(x=999)))]       # RIGHT (but at the cost of wrong  Derived.d)

aha79 avatar Sep 22 '23 15:09 aha79

Interesting. So what I think is happening is the hook for Derived gets created before the include_subclasses hook for A gets installed, hence your behavior.

It's somewhat of a chicken and egg problem. The include_subclasses hook for A needs the hook for Derived to exist, and the hook for Derived needs the include_subclasses hook for A to exist.

I wonder if we can be clever and override hooks for children if their type is the parent type. So the hook for Derived.d would use late binding, i.e. be converter.structure(obj, A). (Note that cattrs doesn't do this by default for performance.)

Tinche avatar Sep 22 '23 17:09 Tinche

@Tinche I'm not sure if this is the same problem, but it seems to be an issue with the tagged union. It works fine if I don't use that.

Example:

>>> import functools
>>> import attrs
>>> import cattrs
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
...     pass
...
>>> @attrs.define()
... class Child(Base):
...     field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c)
>>> result = c.structure({"field": {"field": {}}}, Base)
>>> result
Child(field=Child(field=Base()))
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.unstructure(Child(field=Child(field=Base())))
{'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base()) # Child is lost!

I started using cattrs and suddenly one class didn't work. It seems the "workaround" making fields optional makes it work, but is not a good solution in the long run.

What I had which worked:

@attrs.define
class Base:
    pass

@attrs.define
class List(Base):
    items: List[Base]

@attrs.define
class Child(Base):
    left: Base
    right: Base

List with Child items works fine, but in Child I have to change type of left and right to Optional[Base] in order to have them structured as subclasses to Base.

An example with the List-class:

>>> c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
{'left': {'items': [{'exp_type': 'Base'}, {'left': {'items': [{'exp_type': 'Base'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}], 'exp_type': 'List'}, 'right': {'exp_type': 'Base'}, 'exp_type': 'Child'}
>>> d = c.unstructure(Child(left=List(items=[Base(), Child(left=List(items=[Base()]), right=Base())]),right=Base()))
>>> c.structure(d, Base)
Child(left=Base(), right=Base())

Subclasses without tagged union doesn't work for List because it complains about not having any unique fields, which I find strange as well since it clearly has the unique field items: List[Base], but that's another problem.

anderssonjohan avatar Oct 16 '23 14:10 anderssonjohan

@aha79 @Tinche Here's my ugly workaround which let's me skip the workaround using Optional using the example in my previous comment.

>>> import functools
>>> from typing import Dict, Type
>>> import attrs
>>> import cattrs
>>> import cattrs.gen
>>> import cattrs.strategies
>>> @attrs.define()
... class Base:
...     tag_to_cls: Dict[str, Type] = {}
...     def __init_subclass__(cls, **kwargs):
...         super().__init_subclass__(**kwargs)
...         cls.tag_to_cls.update({cls.__name__: cls})
...
>>> @attrs.define()
... class Child(Base):
...     field: Base
...
>>> c = cattrs.GenConverter()
>>> cattrs.strategies.include_subclasses(Base, c, union_strategy=functools.partial(cattrs.strategies.configure_tagged_union,tag_name="exp_type"))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Base())  # same as in my previous comment, child is lost
>>> def structure_hook(c, o, cl):  # cl = Base
...     cl = Base.tag_to_cls.get(o["exp_type"], cl)  # default to Base
...     struct = cattrs.gen.make_dict_structure_fn(cl, c)
...     return struct(o, cl)
>>> c.register_structure_hook(Base, functools.partial(structure_hook, c))
>>> c.structure({'field': {'field': {'exp_type': 'Base'}, 'exp_type': 'Child'}, 'exp_type': 'Child'}, Base)
Child(field=Child(field=Base()))

anderssonjohan avatar Oct 16 '23 18:10 anderssonjohan

@anderssonjohan I spent some time debugging your example (phew, these are kind of hard to debug tbh).

I think it's the same issue. The reason it works if you change the field type to an Optional is because that forces late resolution for the Child.field hook, which is what solves the underlying issue. Definitely need to figure this out for the release after next (the next release is very close and very overdue, so I don't want to bloat it).

Tinche avatar Oct 17 '23 17:10 Tinche

Looks like this was fixed by a different issue. I've added a unit test for the OP case just the same.

Tinche avatar Jun 18 '24 15:06 Tinche