pandera
polars.exceptions.ColumnNotFoundError when coerce=True and Optional field is missing
Describe the bug
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandera.
- [ ] (optional) I have confirmed this bug exists on the main branch of pandera.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
from pandera.polars import Field  # type: ignore
from pandera.polars import DataFrameModel  # type: ignore
from typing import Optional
import polars as pl


class MyModel(DataFrameModel):
    a: Optional[str] = Field(description="some description", nullable=True)
    b: Optional[str] = Field(description="some description")  # BOOM
    c: Optional[str] = Field(description="some description", str_contains=".", nullable=True)
    d: Optional[str] = Field(description="some description", str_contains=".")  # BOOM


df = pl.DataFrame({})
schema = MyModel.to_schema()
schema.strict = True
schema.coerce = True  # -> without this it works
print(schema.validate(df))  # BOOM
Exception:
(.venv) antonioalegria@shiro dojo % /Users/antonioalegria/Developer/dojo/.venv/bin/python /Users/antonioalegria/Developer/dojo/dojo/test.py
Traceback (most recent call last):
File ".../test.py", line 22, in
Expected behavior
The dataframe should have validated successfully: every field is Optional, so an empty DataFrame satisfies the schema.
Desktop (please complete the following information):
- OS: macOS 14.6.1
- Python 3.12.4
- polars-lts-cpu 1.6.0
- pandera 0.20.3
I have the same problem. If the schema does not coerce the types, the Optional type works as expected (the DataFrame does not require that column). Otherwise, the schema requires that column.
Same issue here, enabling coerce in any column basically makes it required...
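To make the expected behavior concrete, here is a rough pure-Python sketch of a coercion step that skips optional columns when they are absent. It is only an illustration of the idea, not pandera's code and not the fix in the linked PR: `ColumnSpec` and `coerce_columns` are hypothetical names, and a plain dict of lists stands in for the DataFrame.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for a schema column; not a pandera class."""
    name: str
    cast: Callable[[Any], Any]
    required: bool = True


def coerce_columns(
    data: Dict[str, List[Any]], specs: List[ColumnSpec]
) -> Dict[str, List[Any]]:
    """Cast every column that is present; skip optional columns that are absent."""
    out = dict(data)
    for spec in specs:
        if spec.name not in data:
            if spec.required:
                raise KeyError(f"required column {spec.name!r} is missing")
            # Optional and absent: there is nothing to cast, so do not touch it.
            continue
        # Keep nulls as nulls; cast everything else.
        out[spec.name] = [None if v is None else spec.cast(v) for v in data[spec.name]]
    return out


specs = [
    ColumnSpec("a", str, required=False),
    ColumnSpec("b", str, required=False),
]

print(coerce_columns({}, specs))                # {}
print(coerce_columns({"a": [1, None]}, specs))  # {'a': ['1', None]}
```

The point is the `continue` branch: coercion only looks up columns that actually exist, so turning coercion on does not silently make an optional column required.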
This is a duplicate of https://github.com/unionai-oss/pandera/issues/1660, should be addressed by https://github.com/unionai-oss/pandera/pull/1871. Gonna cut a new release in the next few days.