pydantic-core icon indicating copy to clipboard operation
pydantic-core copied to clipboard

Make better union validation decisions based on `extra` behavior

Open sydney-runkle opened this issue 1 year ago • 8 comments

For example:

class ModelA(BaseModel):
    pass

class ModelB(BaseModel, extra='allow'):
    pass

TypeAdapter(ModelA | ModelB).validate_python({'x': 1})
#> ModelA()

Is the current behavior, but that doesn't seem like the best decision.

See https://github.com/pydantic/pydantic-core/pull/1332 and https://github.com/pydantic/pydantic-core/pull/1334 for recent context on union validation decision improvements.

sydney-runkle avatar Jun 19 '24 18:06 sydney-runkle

It seems to me like the solution here would be to have another counter extra_fields_set 🙈

davidhewitt avatar Jun 20 '24 10:06 davidhewitt

Indeed. Honestly a good first issue for someone looking to start in rust and pydantic-core, given that we just set the model for this in the PRs linked above.

sydney-runkle avatar Jun 20 '24 12:06 sydney-runkle

@erohmensing you might be interested in this!

sydney-runkle avatar Jun 20 '24 12:06 sydney-runkle

Thanks for the tag @sydney-runkle! This makes sense, would be happy to take a look.

@davidhewitt is there any reason why you'd think of a counter over a bool for extra_fields_set? From my understanding the models either allow extras or don't, but maybe there's some more complicated custom config logic where you can customize the shape of allowed extras that I'm not aware of?

erohmensing avatar Jun 20 '24 15:06 erohmensing

@erohmensing,

I think a bool would be fine for the reasons above that you mentioned. In some ways, a bool is better, because I don't think an arbitrary count of extra fields set is indicative of a better match if two models have the same extra config setting.

Also, consider:

from pydantic import BaseModel, TypeAdapter

class ModelA(BaseModel):
    a: int
    b: int

class ModelB(ModelA, extra='allow'):
    pass

print(repr(TypeAdapter(ModelA | ModelB).validate_python({'a': 1, 'b': 1, 'x': 1})))
#> ModelA(a=1, b=1)

This is the current behavior - I think the desired behavior would be ModelB. This is a case where we'd have fields_set_count information (a tie), and could default to the extra metric as a tiebreaker. That being said, maybe exactness should take priority over our new metric?

sydney-runkle avatar Jun 20 '24 16:06 sydney-runkle

@sydney-runkle

This is the current behavior - I think the desired behavior would be ModelB

I agree!

That being said, maybe exactness should take priority over our new metric?

do you mean in the case of like

from pydantic import BaseModel, TypeAdapter

class ModelA(BaseModel):
    a: int
    b: int

class ModelB(ModelA, extra='allow'):
    pass

print(repr(TypeAdapter(ModelA | ModelB).validate_python({'a': 1, 'b': 1)))

should be Model A? If so, agreed - that seems to be the case and I think it would make sense for it to stay like that

erohmensing avatar Jun 20 '24 17:06 erohmensing

@erohmensing, are you still interested in contributing here?

sydney-runkle avatar Aug 28 '24 17:08 sydney-runkle

Is this still relevant? I'd like to take a shot at this

Stormageddon37 avatar Oct 09 '25 10:10 Stormageddon37