Union type of `Enum` and primitive type always gives primitive type
Initial Checks
- [X] I confirm that I'm using Pydantic V2
Description
For example, `Union[IntEnum, int]` always validates to `int`, and an enum with string values in a union with `str` always validates to `str`.
I tried lots of ways of defining a Union type where an Enum is one option and the Enum values' primitive type is the other option. None of these attempts worked, so I thought I'd ask here in case I'm missing something obvious or not so obvious.
That is, the union never selects the enum, even though the same value validates to the enum once the primitive `int` option is removed from the union.
A Literal works instead of an enum, but Enums are documented as if they're valid for representing multiple choices, so I'm surprised that this isn't working. If I have to write a manual field validator it feels like the type annotation machinery that Pydantic uses so well is going to waste.
This also prevents 'splitting' models based on a union annotation: as shown here, Literal would let you choose a different model to handle binary numbers but an Enum will never be used so would effectively produce a single 'un-split' model.
from enum import Enum
from typing import Literal, Union
from pydantic import RootModel, TypeAdapter
class BinaryEnum(Enum):
    ZERO = 0
    ONE = 1

class EnumRootModel(RootModel):
    root: BinaryEnum

class LiteralRootModel(RootModel):
    root: Literal[0, 1]

LiteralUnion = Union[LiteralRootModel, int]
EnumUnion = Union[EnumRootModel, int]
print("Using Literal root model:")
print(TypeAdapter(list[LiteralUnion]).validate_python([1, 2]))
print()
print("Using Enum root model:")
print(TypeAdapter(list[EnumUnion]).validate_python([1, 2]))
Using Literal root model:
[LiteralRootModel(root=1), 2]
Using Enum root model:
[1, 2]
- Using a `RootModel` with `int`-type `root` produces the same result
I find this difference in behaviour counterintuitive to the point that I'm inclined to think it's a bug.
When I used datamodel-code-generator, my recollection is that you could choose between Literals and Enums interchangeably there, so this might also affect models it generates, though I'm not sure.
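The manual validator fallback mentioned above could be sketched roughly as follows. This is only an illustration, not something proposed in this issue: the helper name `prefer_binary` and the alias `BinaryOrInt` are hypothetical, and the sketch assumes that an enum instance passed into the union is matched by the enum branch rather than the `int` branch.

```python
from enum import Enum
from typing import Annotated, Any, Union

from pydantic import BeforeValidator, TypeAdapter


class Binary(Enum):
    ZERO = 0
    ONE = 1


def prefer_binary(value: Any) -> Any:
    """Hypothetical helper: convert to Binary when the value is a member value."""
    try:
        return Binary(value)
    except ValueError:
        # Not an enum value; leave it for the remaining union members.
        return value


# The before-validator runs ahead of the union, so the union sees an
# enum instance whenever the raw value is one of the enum's values.
BinaryOrInt = Annotated[Union[Binary, int], BeforeValidator(prefer_binary)]

result = TypeAdapter(list[BinaryOrInt]).validate_python([1, 2])
print(result)
```

The cost is exactly the one described above: the preference for the enum lives in a hand-written function rather than in the type annotation itself.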
Example Code
from enum import Enum
from typing import Union
from pydantic import TypeAdapter
class Binary(Enum):
    x = 0
    y = 1

union_fwd = Union[Binary, int]
union_rev = Union[int, Binary]
val = 1
enum_result = TypeAdapter(Binary).validate_python(val)
fwd_result = TypeAdapter(union_fwd).validate_python(val)
rev_result = TypeAdapter(union_rev).validate_python(val)
int_result = TypeAdapter(int).validate_python(val)
assert fwd_result == rev_result == int_result == val
assert enum_result == Binary.y
Python, Pydantic & OS Version
pydantic version: 2.1.1
pydantic-core version: 2.4.0
pydantic-core build: profile=release pgo=true mimalloc=true
install path: /home/louis/miniconda3/envs/pydanticv2/lib/python3.11/site-packages/pydantic
python version: 3.11.4 (main, Jul 5 2023, 13:45:01) [GCC 11.2.0]
platform: Linux-5.15.0-43-generic-x86_64-with-glibc2.35
optional deps. installed: ['typing-extensions']
Selected Assignee: @hramezani
I saw it noted elsewhere by @adriangb that:
v2 does multiple passes on unions. the first without coercion and if none of the options match it does another pass with coercion
Would that description maybe explain this case? Is the first pass trying to match str type, and finding a str in an enum value, so not resorting to the 2nd pass to coerce the str value to Enum?
I’m not familiar with the internals of this routine so don’t know where to look to confirm or not.
- Perhaps `_internal._std_types_schema`’s `get_enum_core_schema()`
  - Maybe it needs a validator with `__get_pydantic_core_schema__` (like some of the other types listed there)?
This bug might also be phrased as “strict coercion is always used for Union of Enum and the Enum value’s type (or subclassed type)” if so.
- The “subclassed type” here meaning `int` for `IntEnum`, `str` for `StrEnum`, but the same effect is seen when using a regular `Enum`.
  - Indeed I expect you could have a regular `Enum` with both `int` and `str` values, and a Union with `int` and `str` would likewise resolve to the primitive types.
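To make the quoted two-pass description concrete, here is a toy model of it in plain Python. It is not pydantic-core's implementation; every name in it (`NoMatch`, `validate_int`, `validate_binary`, `validate_union`) is made up for illustration, and it only assumes what the quote says: a first pass without coercion, then a second pass with coercion if nothing matched.

```python
from enum import Enum
from typing import Any, Callable, Sequence


class Binary(Enum):
    ZERO = 0
    ONE = 1


class NoMatch(Exception):
    """Raised by a toy validator when the value does not fit."""


def validate_int(value: Any, strict: bool) -> int:
    if isinstance(value, int):
        return value
    if not strict and isinstance(value, str):
        try:
            return int(value)  # lax pass: coerce numeric strings
        except ValueError:
            pass
    raise NoMatch(value)


def validate_binary(value: Any, strict: bool) -> Binary:
    if isinstance(value, Binary):
        return value
    if not strict:
        try:
            return Binary(value)  # lax pass: look the member up by value
        except ValueError:
            pass
    raise NoMatch(value)


def validate_union(value: Any, choices: Sequence[Callable[[Any, bool], Any]]) -> Any:
    # Pass 1 without coercion, pass 2 with coercion.
    for strict in (True, False):
        for choice in choices:
            try:
                return choice(value, strict)
            except NoMatch:
                continue
    raise NoMatch(value)


both = validate_union(1, [validate_binary, validate_int])
enum_only = validate_union(1, [validate_binary])
print(repr(both), repr(enum_only))
```

In this model the plain `1` is accepted by the `int` branch during the first pass, so the coercing pass that would produce `Binary.ONE` never runs. That is consistent with the behaviour reported in this issue, whatever the real internals look like.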
I will aim to improve this in https://github.com/pydantic/pydantic-core/pull/867
(I will see if it's possible to prefer the Enum class if possible.)
This is now fixed on main, thanks @davidhewitt (I'm guessing an old fix from https://github.com/pydantic/pydantic-core/pull/867).
Maybe I understand something wrong here. But I am running into the same issue and am not getting the impression that it was fixed. I am using pydantic 2.7.4.
from enum import Enum
from typing import Union
import pydantic
print(pydantic.VERSION)
class FooEnum(Enum):
    A = 1
    B = 2

class BarModel(pydantic.BaseModel):
    foo: Union[FooEnum, int]

bar = BarModel(foo=1)
print(bar.foo)
print(type(bar.foo))
print(bar.model_dump())
And the output is:
2.7.4
1
<class 'int'>
{'foo': 1}
Reopening, I think this was a regression with our migration of enum validators to rust!
Blessed are the backlog revivers :pray:
Will be trying to rewrite our enum validator in rust for v2.10 :) to fix these regressions / discrepancies with earlier versions!
Circling back here as I make changes to the union / literal / enum validators.
I think this behavior is correct - it makes sense to me that we prioritize primitives - they're more of an exact match based on our exact / strict / lax match scoring system.
See below for another example:
from pydantic import TypeAdapter
from enum import Enum
from typing import Literal
class MyEnum(str, Enum):
    FOO = 'foo'
    BAR = 'bar'

ta = TypeAdapter(MyEnum | Literal['foo'])
assert ta.validate_python('foo') == 'foo'
assert ta.validate_python(MyEnum.FOO) == 'foo'
assert ta.validate_python('bar') == MyEnum.BAR
Going to mark this as not planned for now.
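For anyone who still wants the enum tried first, one option not discussed in this thread is to opt out of the scoring for that field. The sketch below assumes a Pydantic version that supports `Field(union_mode='left_to_right')`; in that mode the union members are tried in the order they are written, so the enum should be chosen whenever the value is one of its members.

```python
from enum import Enum
from typing import Union

from pydantic import BaseModel, Field


class FooEnum(Enum):
    A = 1
    B = 2


class BarModel(BaseModel):
    # Assumption: union_mode is available in the installed Pydantic version.
    # Members are tried in order: FooEnum first, then int.
    foo: Union[FooEnum, int] = Field(union_mode="left_to_right")


known = BarModel(foo=1)  # 1 is a FooEnum value
other = BarModel(foo=3)  # 3 is not, so the int branch is used
print(type(known.foo), type(other.foo))
```

The order of the union members matters here: writing `Union[int, FooEnum]` with the same setting would let `int` win again.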