cattrs icon indicating copy to clipboard operation
cattrs copied to clipboard

Register Hooks for Generic Classes

Open raabf opened this issue 3 years ago • 4 comments

  • cattrs version: 1.10.0
  • Python version: 3.7
  • Operating System: Linux 5.16.1-1 OpenSUSE Thumbleweed

I want to register structure/unstructure hooks for classes which inherit typing.Generic and using a T = typing.TypeVar('T'), i.e. MyClass(typing.Generic[T]) – so that the structure/unstructure hooks are called when MyClass is used with a concrete type, for example MyClass[str]. In the code box below you can see an example in the function main1() about that. So the question is: Is there an intended correct way to register structure/unstructure hooks for Generic Classes?

I currently make use of register_structure_hook_func and check the correct condition myself. The statement MyClass[str] will generate an typing._GenericAlias instance which __origin__ arrtribute holds a reference to the class MyClass, which is the condition I check.


from typing import Generic, TypeVar

import attr
import cattr

T = TypeVar('T')

converter = cattr.GenConverter()


class MyClass(Generic[T]):

    @staticmethod
    def deserialize(value, cls):
        print(f"deserialize {value}, {cls}")
        return MyClass()

    def serialize(self):
        print(f"serialize {self}")
        return 'foobar'


@attr.define
class MyBase:
    a: MyClass[str]


def main1() -> None:
    # MyClass[str] is faulty behaviour here

    print("•••• direct ••••")
    converter.unstructure(MyClass())  # OK
    converter.structure("hello1", MyClass)  # OK

    print("•••• With T ••••")
    converter.unstructure(MyClass[T]())  # OK
    converter.structure("hello2", MyClass[T])  # OK

    print("•••• With str ••••")
    print(f"subclass: {isinstance(MyClass[str](), MyClass)} – {isinstance(MyClass[str], MyClass)}")
    converter.unstructure(MyClass[str]()) # Probably unintended behaviour but `serialize()` is called because of above `issubclass` is True

    b = converter.unstructure(MyBase(a=MyClass[str]()))
    print("b:", b)  # unstructured is an instance of 'MyClass' instead of string 'foobar'

    try:  # cattr.errors.StructureHandlerNotFoundError: Unsupported type: __main__.MyClass[str]. Register a structure hook for it.
        converter.structure("hello3", MyClass[str])
    except Exception as err:
        print(err)


def main2() -> None:
    # Everything works fine here

    print("•••• direct ••••")
    converter.unstructure(MyClass())
    converter.structure("hello1", MyClass)

    print("•••• With T ••••")
    converter.unstructure(MyClass[T]())
    converter.structure("hello2", MyClass[T])

    print("•••• With str ••••")
    converter.unstructure(MyClass[str]())

    b = converter.unstructure(MyBase(a=MyClass[str]()))
    print("b:", b)  # correctly is 'foobar'

    converter.structure("hello3", MyClass[str])


if __name__ == '__main__':
    print("•••• main1 ••••")

    converter.register_structure_hook(MyClass, MyClass.deserialize)
    converter.register_unstructure_hook(MyClass, MyClass.serialize)

    converter.register_structure_hook(MyClass[T], MyClass.deserialize)
    converter.register_unstructure_hook(MyClass[T], MyClass.serialize)

    main1()


    print()
    print("•••• main2 ••••")

    converter.register_structure_hook_func(
        # the default value (here `bool`) must be something so that the expression is `False`
        # The object has a __origin__ attribute if it us used as `Class[TYPE]`, then __origin__ will point to `Class`. This
        # test is secure enough since it is not only tested that the class has the attribute, but that it is also
        # a subclass, which is the class we want to support with this converter.
        lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
        MyClass.deserialize,
    )
    converter.register_unstructure_hook_func(
        lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
        MyClass.serialize,
    )

    main2()

Output:

•••• main1 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
subclass: True – False
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
b: {'a': <__main__.MyClass object at 0x7f6bdb8515d0>}
Unsupported type: __main__.MyClass[str]. Register a structure hook for it.

•••• main2 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
serialize <__main__.MyClass object at 0x7f6bdbb97510>
b: {'a': 'foobar'}
deserialize hello3, __main__.MyClass[str]

The reason why converter.register_structure_hook(MyClass[T], MyClass.deserialize) is probably how typing._GenericAlias defines __eq__ and __hash__, i.e two of them are only equal when both __origin__ and all args (above this is T vs. str) are equal :

class _GenericAlias(_Final, _root=True):

    def __eq__(self, other):
        if not isinstance(other, _GenericAlias):
            return NotImplemented
        if self.__origin__ != other.__origin__:
            return False
        if self.__origin__ is Union and other.__origin__ is Union:
            return frozenset(self.__args__) == frozenset(other.__args__)
        return self.__args__ == other.__args__

    def __hash__(self):
        if self.__origin__ is Union:
            return hash((Union, frozenset(self.__args__)))
        return hash((self.__origin__, self.__args__))

raabf avatar Feb 02 '22 15:02 raabf

You got it right. register_structure_hook delegates the logic to a functools.singledispatch instance, and those don't work well with a lot of these more advanced use cases.

So then you use the register_unstructure_hook_func approach. These can be whatever you like; yours looks good.

Tinche avatar Feb 02 '22 23:02 Tinche

OK very nice!

However, to make it easier for other users: Usually in cattrs, when you un-/structure a certain class and this class does not have an un-/structure hook itself, but its parent, cattrs will use the un-/structure of the parent.

cattrs could be designed in that way that it sees MyClass[WHATEVER] as a subclass of MyClass although it isn’t technically and automatically use un-/structure hook of MyClass for any MyClass[WHATEVER], i.e. that you only register MyClass and that the following work:

def main3() -> None:

    print("•••• direct ••••")
    converter.unstructure(MyClass())
    converter.structure("hello1", MyClass)

    print("•••• With str ••••")
    converter.unstructure(MyClass[str]()) # Do call MyClass.serialize()

    converter.unstructure(MyBase(a=MyClass[str]())) # Do call MyClass.serialize()

    converter.structure("hello3", MyClass[str]) # Do call MyClass.deserialize()

if __name__ == '__main__':

    converter.register_structure_hook(MyClass, MyClass.deserialize)
    converter.register_unstructure_hook(MyClass, MyClass.serialize)

    main3()

I do not know so much about functools.singledispatch but that sound something hard to tell it. The problem is that something like MyClass[str] and MyClass[int] are two different instances with no common ancestor – the only way I found is checking the __origin__ attribute.

raabf avatar Feb 04 '22 17:02 raabf

Problem of this issue also described here #87

raabf avatar Feb 08 '22 11:02 raabf

I also just hit this, with an attribute of type Foo[Bar] not being serialized with the error:

cattrs.errors.StructureHandlerNotFoundError: Unsupported type: 'Foo[Bar]'. Register a structure hook for it.

The workaround was to do this (i already had the issubclass_safe function):

def issubclass_safe(sub: Any, sup: type) -> bool:
    if typing.get_origin(sub) is not None:
        # is an instantiation of a generic type, so compare also the unparameterized type
        if issubclass_safe(typing.get_origin(sub), sup):
            return True
    # check if sub is actually a class, otherwise issubclass throws
    return inspect.isclass(sub) and issubclass(sub, sup)

_converter.register_structure_hook_func(
        lambda t: issubclass_safe(t, Foo),

But in my opinion cattrs should use typing.get_origin(type) (same as type.__origin__) and check that against registered classes as well

phiresky avatar Jun 10 '22 12:06 phiresky