dacite
Feature proposal: add support for explicit union selection
We use dacite pretty heavily for some very complex dataclass hierarchies, and the one pain point we keep running into is the potentially nondeterministic Union selection. The naive approach is very good for what it is, but it fundamentally cannot work when multiple options have the same interface. For the corresponding marshmallow schemas we use marshmallow-polyfield to explicitly define polymorphism based on another attribute, so I'm hoping we could add a similar feature to dacite.
Proposed API:
Add a new explicit_unions option to dacite.Config, which is a Dict[str, Callable] for attribute resolution. This will allow for explicit union matching without any modification to your dataclasses.
For larger hierarchies, setting config options at lower levels would be much more usable. Since dataclasses specifically ignore class variables, we could add a class variable holding a custom dacite config for that class. Dacite would override any parent config with the current class's.
Thus, explicit union handling would look like this:
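    @dataclass
    class Foo:
        a: Union[A, B]
        a_type: str
        # Proposed API only: explicit_unions is not an existing dacite.Config option.
        # ClassVar keeps dacite_config out of the dataclass fields.
        dacite_config: ClassVar[dacite.Config] = dacite.Config(
            explicit_unions={"a": lambda data: A if data["a_type"] == "a" else B}
        )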
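To make the intended behavior concrete, here is a minimal standard-library sketch of resolver-based union selection. It does not use dacite and is not how dacite is implemented; `from_dict_explicit`, the `Resolvers` alias, and the field `x` on `A` and `B` are hypothetical names chosen for illustration. In this sketch, resolvers apply only to the top-level class.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, Dict, Optional, Union, get_type_hints


@dataclass
class A:
    x: int


@dataclass
class B:
    x: int  # same interface as A, so the data alone cannot tell them apart


@dataclass
class Foo:
    a: Union[A, B]
    a_type: str


# Hypothetical resolver map: field name -> callable that receives the parent's
# raw dict and returns the class to build for that field.
Resolvers = Dict[str, Callable[[Dict[str, Any]], type]]


def from_dict_explicit(
    cls: type, data: Dict[str, Any], resolvers: Optional[Resolvers] = None
) -> Any:
    """Build `cls` from `data`, using `resolvers` to pick Union members."""
    resolvers = resolvers or {}
    hints = get_type_hints(cls)
    kwargs = {}
    for f in fields(cls):
        if f.name not in data:
            continue  # leave it to the dataclass default (or its TypeError)
        value = data[f.name]
        # A resolver, if present, overrides the annotation for this field.
        target = resolvers[f.name](data) if f.name in resolvers else hints[f.name]
        if is_dataclass(target) and isinstance(value, dict):
            value = from_dict_explicit(target, value)
        kwargs[f.name] = value
    return cls(**kwargs)


resolvers = {"a": lambda data: A if data["a_type"] == "a" else B}
foo = from_dict_explicit(Foo, {"a": {"x": 1}, "a_type": "b"}, resolvers)
print(foo)  # Foo(a=B(x=1), a_type='b')
```

The resolver sees the parent's raw data, so the discriminator (`a_type`) can live next to the union field rather than inside `A` or `B`.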
I wanted to open a discussion here before opening a PR as the API could be very opinionated. Thank you for a great project.
Hi @eddawley - thank you for sharing this very interesting idea.
So as I understand you are proposing 2 new features:
1. explicit_unions
2. dataclass-level configuration
Let's start with the first one. Is it possible to use Literal in your case? You can use it in the following way:
    from dataclasses import dataclass
    from typing import Literal, Union

    import dacite

    @dataclass
    class A:
        t: Literal["a"]

    @dataclass
    class B:
        t: Literal["b"]

    @dataclass
    class C:
        u: Union[A, B]

    print(dacite.from_dict(C, {"u": {"t": "a"}}))  # C(u=A(t='a'))
    print(dacite.from_dict(C, {"u": {"t": "b"}}))  # C(u=B(t='b'))
You are correct that this is actually 2 new features. I realized that after I submitted. Sorry for any confusion.
As for using Literal, that only works if the relationship is defined on a child's attribute. With marshmallow-polyfield you define the relationship on one or more attributes in the parent.
Here's a simple example explaining the difference:
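    from dataclasses import dataclass
    from typing import Literal, Optional, Union

    @dataclass
    class Cat:
        name: str
        greeting: Optional[str] = None

    @dataclass
    class Dog:
        name: str
        greeting: Optional[str] = None

    # The discriminator lives on the parent, next to the union field.
    @dataclass
    class Person:
        pet_type: Literal["cat", "dog"]
        pet: Union[Cat, Dog]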
vs
    @dataclass
    class Cat:
        name: str
        pet_type: Literal["cat"]
        greeting: Optional[str] = None

    @dataclass
    class Dog:
        name: str
        pet_type: Literal["dog"]
        greeting: Optional[str] = None

    # The discriminator lives on each child instead of the parent.
    @dataclass
    class Person:
        pet: Union[Cat, Dog]
In the former, when no greeting is supplied, Cat and Dog accept exactly the same data, so there is no way for dacite to get an instantiation error. Thus it will accept the first option every time.
The latter might work for some cases, but things like SQLAlchemy polymorphic mappings require the former.
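The ambiguity in the first layout (discriminator on Person) can be reproduced with a simplified stand-in for "try each union member in order and keep the first that instantiates". This is only an illustration of that strategy, not dacite's actual code; `first_match` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence


@dataclass
class Cat:
    name: str
    greeting: Optional[str] = None


@dataclass
class Dog:
    name: str
    greeting: Optional[str] = None


def first_match(candidates: Sequence[type], data: Dict[str, Any]) -> Any:
    """Return an instance of the first candidate that accepts `data`."""
    for candidate in candidates:
        try:
            return candidate(**data)
        except TypeError:
            continue  # unexpected or missing fields: try the next candidate
    raise ValueError("no candidate matched")


# Cat and Dog have the same interface, so Cat always wins, whatever
# pet_type the parent dict carried.
pet = first_match([Cat, Dog], {"name": "Rex"})
print(type(pet).__name__)  # Cat
```

Because nothing in the child data distinguishes the two classes, only information from the parent (here, pet_type) can make the choice deterministic.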