polars
object dtype not supported in Series.iter
Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl
def func(d):
    return 1
df = pl.DataFrame({"col": pl.Series(["1", "2", "3"], dtype=pl.Object)})
# df.select(pl.col("col").apply(func)) # This works
df.select(pl.struct("col").apply(func)) # This doesn't
Issue description
thread '<unnamed>' panicked at 'object dtype not supported in Series.iter', /home/runner/work/polars/polars/polars/polars-core/src/series/iterator.rs:70:9
--- PyO3 is resuming a panic after fetching a PanicException from Python. ---
Python stack trace below:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.9/site-packages/polars/expr/expr.py", line 3821, in wrap_f
    return x.apply(
  File "/opt/venv/lib/python3.9/site-packages/polars/series/series.py", line 4560, in apply
    self._s.apply_lambda(function, pl_return_dtype, skip_nulls)
pyo3_runtime.PanicException: object dtype not supported in Series.iter
Traceback (most recent call last):
  File "/opt/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3505, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-11-e829c382dcd2>", line 1, in <module>
    df.select(pl.struct("col").apply(func))
  File "/opt/venv/lib/python3.9/site-packages/polars/dataframe/frame.py", line 7301, in select
    return self.lazy().select(*exprs, **named_exprs).collect(no_optimization=True)
  File "/opt/venv/lib/python3.9/site-packages/polars/lazyframe/frame.py", line 1530, in collect
    return wrap_df(ldf.collect())
pyo3_runtime.PanicException: Unwrapped panic from Python code
Expected behavior
Function is applied over struct without errors
Installed versions
--------Version info---------
Polars: 0.18.9
Index type: UInt32
Platform: Linux-6.4.7-arch1-1-x86_64-with-glibc2.31
Python: 3.9.16 (main, Mar 23 2023, 04:33:57)
[GCC 10.2.1 20210110]
----Optional dependencies----
adbc_driver_sqlite: <not installed>
cloudpickle: 2.2.1
connectorx: <not installed>
deltalake: <not installed>
fsspec: 2023.4.0
matplotlib: 3.7.1
numpy: 1.23.2
pandas: 1.5.3
pyarrow: 11.0.0
pydantic: 1.10.7
sqlalchemy: 1.4.47
xlsx2csv: <not installed>
xlsxwriter: <not installed>
import polars as pl
def func(d):
    return 1
df = pl.DataFrame({"col": pl.Series(["1", "2", "3"], dtype=pl.Object)})
df.apply(func)
This also fails.
Yep, we don't support that for object types yet. Try to avoid objects.
Until we can support object types here, we should throw a nice error somewhere.
Please, pay attention to this. With this error, it's not possible to work with UUID columns.
You can store your UUID as a string column.
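For anyone landing here, a minimal sketch of that workaround: convert the UUIDs to strings before building the frame, and parse them back on the way out. The helper names (uuids_to_strings, strings_to_uuids, build_frame) and the "id" column name are made up for illustration and are not part of Polars.

```python
import uuid


def uuids_to_strings(values):
    """Turn uuid.UUID objects into their canonical string form; keep None as-is."""
    return [None if v is None else str(v) for v in values]


def strings_to_uuids(values):
    """Parse canonical UUID strings back into uuid.UUID objects; keep None as-is."""
    return [None if s is None else uuid.UUID(s) for s in values]


def build_frame(ids):
    """Build a DataFrame with a string column instead of an Object column."""
    import polars as pl  # imported here so the helpers above work without Polars

    # A list of str/None is inferred as a string column, not pl.Object.
    return pl.DataFrame({"id": uuids_to_strings(ids)})
```

Usage would look like df = build_frame(my_uuids), then strings_to_uuids(df["id"].to_list()) whenever real UUID objects are needed again. The trade-off is a conversion at the boundaries, but the column is no longer an Object column, so it avoids the code path that panics above.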
Why does pl.col work fine with Object, but pl.struct does not? This also makes multi-column aggregation that includes an Object column impossible.
A shorter reproduction:
import polars as pl
# runs correctly
s = pl.Series("a", [object()])
s.map_elements(lambda x: x, return_dtype=pl.Object)
# fails
s = pl.Series("a", [{"s": object()}])
s.map_elements(lambda x: x["s"], return_dtype=pl.Object)
iter_rows also happily runs on rows with object columns.