Register Hooks for Generic Classes
- cattrs version: 1.10.0
- Python version: 3.7
- Operating System: Linux 5.16.1-1 OpenSUSE Thumbleweed
I want to register structure/unstructure hooks for classes which inherit typing.Generic and using a T = typing.TypeVar('T'), i.e. MyClass(typing.Generic[T]) – so that the structure/unstructure hooks are called when MyClass is used with a concrete type, for example MyClass[str].
In the code box below you can see an example in the function main1() about that. So the question is: Is there an intended correct way to register structure/unstructure hooks for Generic Classes?
I currently make use of register_structure_hook_func and check the correct condition myself. The statement MyClass[str] will generate an typing._GenericAlias instance which __origin__ arrtribute holds a reference to the class MyClass, which is the condition I check.
from typing import Generic, TypeVar
import attr
import cattr
T = TypeVar('T')
converter = cattr.GenConverter()
class MyClass(Generic[T]):
@staticmethod
def deserialize(value, cls):
print(f"deserialize {value}, {cls}")
return MyClass()
def serialize(self):
print(f"serialize {self}")
return 'foobar'
@attr.define
class MyBase:
a: MyClass[str]
def main1() -> None:
# MyClass[str] is faulty behaviour here
print("•••• direct ••••")
converter.unstructure(MyClass()) # OK
converter.structure("hello1", MyClass) # OK
print("•••• With T ••••")
converter.unstructure(MyClass[T]()) # OK
converter.structure("hello2", MyClass[T]) # OK
print("•••• With str ••••")
print(f"subclass: {isinstance(MyClass[str](), MyClass)} – {isinstance(MyClass[str], MyClass)}")
converter.unstructure(MyClass[str]()) # Probably unintended behaviour but `serialize()` is called because of above `issubclass` is True
b = converter.unstructure(MyBase(a=MyClass[str]()))
print("b:", b) # unstructured is an instance of 'MyClass' instead of string 'foobar'
try: # cattr.errors.StructureHandlerNotFoundError: Unsupported type: __main__.MyClass[str]. Register a structure hook for it.
converter.structure("hello3", MyClass[str])
except Exception as err:
print(err)
def main2() -> None:
# Everything works fine here
print("•••• direct ••••")
converter.unstructure(MyClass())
converter.structure("hello1", MyClass)
print("•••• With T ••••")
converter.unstructure(MyClass[T]())
converter.structure("hello2", MyClass[T])
print("•••• With str ••••")
converter.unstructure(MyClass[str]())
b = converter.unstructure(MyBase(a=MyClass[str]()))
print("b:", b) # correctly is 'foobar'
converter.structure("hello3", MyClass[str])
if __name__ == '__main__':
print("•••• main1 ••••")
converter.register_structure_hook(MyClass, MyClass.deserialize)
converter.register_unstructure_hook(MyClass, MyClass.serialize)
converter.register_structure_hook(MyClass[T], MyClass.deserialize)
converter.register_unstructure_hook(MyClass[T], MyClass.serialize)
main1()
print()
print("•••• main2 ••••")
converter.register_structure_hook_func(
# the default value (here `bool`) must be something so that the expression is `False`
# The object has a __origin__ attribute if it us used as `Class[TYPE]`, then __origin__ will point to `Class`. This
# test is secure enough since it is not only tested that the class has the attribute, but that it is also
# a subclass, which is the class we want to support with this converter.
lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
MyClass.deserialize,
)
converter.register_unstructure_hook_func(
lambda cls: issubclass(getattr(cls, '__origin__', bool), MyClass),
MyClass.serialize,
)
main2()
Output:
•••• main1 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
subclass: True – False
serialize <__main__.MyClass object at 0x7f6bdb8515d0>
b: {'a': <__main__.MyClass object at 0x7f6bdb8515d0>}
Unsupported type: __main__.MyClass[str]. Register a structure hook for it.
•••• main2 ••••
•••• direct ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello1, <class '__main__.MyClass'>
•••• With T ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
deserialize hello2, __main__.MyClass[~T]
•••• With str ••••
serialize <__main__.MyClass object at 0x7f6bdbb97510>
serialize <__main__.MyClass object at 0x7f6bdbb97510>
b: {'a': 'foobar'}
deserialize hello3, __main__.MyClass[str]
The reason why converter.register_structure_hook(MyClass[T], MyClass.deserialize) is probably how typing._GenericAlias defines __eq__ and __hash__, i.e two of them are only equal when both __origin__ and all args (above this is T vs. str) are equal :
class _GenericAlias(_Final, _root=True):
def __eq__(self, other):
if not isinstance(other, _GenericAlias):
return NotImplemented
if self.__origin__ != other.__origin__:
return False
if self.__origin__ is Union and other.__origin__ is Union:
return frozenset(self.__args__) == frozenset(other.__args__)
return self.__args__ == other.__args__
def __hash__(self):
if self.__origin__ is Union:
return hash((Union, frozenset(self.__args__)))
return hash((self.__origin__, self.__args__))
You got it right. register_structure_hook delegates the logic to a functools.singledispatch instance, and those don't work well with a lot of these more advanced use cases.
So then you use the register_unstructure_hook_func approach. These can be whatever you like; yours looks good.
OK very nice!
However, to make it easier for other users: Usually in cattrs, when you un-/structure a certain class and this class does not have an un-/structure hook itself, but its parent, cattrs will use the un-/structure of the parent.
cattrs could be designed in that way that it sees MyClass[WHATEVER] as a subclass of MyClass although it isn’t technically and automatically use un-/structure hook of MyClass for any MyClass[WHATEVER], i.e. that you only register MyClass and that the following work:
def main3() -> None:
print("•••• direct ••••")
converter.unstructure(MyClass())
converter.structure("hello1", MyClass)
print("•••• With str ••••")
converter.unstructure(MyClass[str]()) # Do call MyClass.serialize()
converter.unstructure(MyBase(a=MyClass[str]())) # Do call MyClass.serialize()
converter.structure("hello3", MyClass[str]) # Do call MyClass.deserialize()
if __name__ == '__main__':
converter.register_structure_hook(MyClass, MyClass.deserialize)
converter.register_unstructure_hook(MyClass, MyClass.serialize)
main3()
I do not know so much about functools.singledispatch but that sound something hard to tell it. The problem is that something like MyClass[str] and MyClass[int] are two different instances with no common ancestor – the only way I found is checking the __origin__ attribute.
Problem of this issue also described here #87
I also just hit this, with an attribute of type Foo[Bar] not being serialized with the error:
cattrs.errors.StructureHandlerNotFoundError: Unsupported type: 'Foo[Bar]'. Register a structure hook for it.
The workaround was to do this (i already had the issubclass_safe function):
def issubclass_safe(sub: Any, sup: type) -> bool:
if typing.get_origin(sub) is not None:
# is an instantiation of a generic type, so compare also the unparameterized type
if issubclass_safe(typing.get_origin(sub), sup):
return True
# check if sub is actually a class, otherwise issubclass throws
return inspect.isclass(sub) and issubclass(sub, sup)
_converter.register_structure_hook_func(
lambda t: issubclass_safe(t, Foo),
But in my opinion cattrs should use typing.get_origin(type) (same as type.__origin__) and check that against registered classes as well