Make better union validation decisions based on `extra` behavior
For example:
class ModelA(BaseModel):
pass
class ModelB(BaseModel, extra='allow'):
pass
TypeAdapter(ModelA | ModelB).validate_python({'x': 1})
#> ModelA()
Is the current behavior, but that doesn't seem like the best decision.
See https://github.com/pydantic/pydantic-core/pull/1332 and https://github.com/pydantic/pydantic-core/pull/1334 for recent context on union validation decision improvements.
It seems to me like the solution here would be to have another counter extra_fields_set 🙈
Indeed. Honestly a good first issue for someone looking to start in rust and pydantic-core, given that we just set the model for this in the PRs linked above.
@erohmensing you might be interested in this!
Thanks for the tag @sydney-runkle! This makes sense, would be happy to take a look.
@davidhewitt is there any reason why you'd think of a counter over a bool for extra_fields_set? From my understanding the models either allow extras or don't, but maybe there's some more complicated custom config logic where you can customize the shape of allowed extras that I'm not aware of?
@erohmensing,
I think a bool would be fine for the reasons above that you mentioned. In some ways, a bool is better, because I don't think an arbitrary count of extra fields set is indicative of a better match if two models have the same extra config setting.
Also, consider:
from pydantic import BaseModel, TypeAdapter
class ModelA(BaseModel):
a: int
b: int
class ModelB(ModelA, extra='allow'):
pass
print(repr(TypeAdapter(ModelA | ModelB).validate_python({'a': 1, 'b': 1, 'x': 1})))
#> ModelA(a=1, b=1)
This is the current behavior - I think the desired behavior would be ModelB. This is a case where we'd have fields_set_count information (a tie), and could default to the extra metric as a tiebreaker. That being said, maybe exactness should take priority over our new metric?
@sydney-runkle
This is the current behavior - I think the desired behavior would be ModelB
I agree!
That being said, maybe exactness should take priority over our new metric?
do you mean in the case of like
from pydantic import BaseModel, TypeAdapter
class ModelA(BaseModel):
a: int
b: int
class ModelB(ModelA, extra='allow'):
pass
print(repr(TypeAdapter(ModelA | ModelB).validate_python({'a': 1, 'b': 1)))
should be Model A? If so, agreed - that seems to be the case and I think it would make sense for it to stay like that
@erohmensing, are you still interested in contributing here?
Is this still relevant? I'd like to take a shot at this