polars
Schema error when vstacking a null-typed column with a typed column
Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl
null_typed = pl.DataFrame({'str_col': [None], 'f64_col': None})
typed = pl.DataFrame({'str_col': ['Hello'], 'f64_col': 4.2})
stacked = null_typed.vstack(typed)
Log output
Traceback (most recent call last):
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-143-9da1bcb123b1>", line 1, in <module>
stacked = null_typed.vstack(typed)
^^^^^^^^^^^^^^^^^^^^^^^^
File "...\venv\Lib\site-packages\polars\dataframe\frame.py", line 6504, in vstack
return self._from_pydf(self._df.vstack(other._df))
^^^^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.SchemaError: cannot extend/append Null with Utf8
Issue description
This issue is related to closed issue #11824 and to feature #12771, which was merged in Python Polars 0.19.18.
So far, I have worked around this issue with exception handling:
try:
    stacked_ok = null_typed.vstack(typed)
except pl.exceptions.SchemaError:
    # Fall back to stacking in the reverse order, which changes the row order.
    stacked_ok = typed.vstack(null_typed)
Expected behavior
I expect it to be possible to vstack the two DataFrames, with the schema of the initial null-typed DataFrame being updated on the fly.
stacked_ok = null_typed.vstack(typed)
shape: (2, 2)
┌─────────┬─────────┐
│ str_col ┆ f64_col │
│ ---     ┆ ---     │
│ str     ┆ f64     │
╞═════════╪═════════╡
│ Hello   ┆ 4.2     │
│ null    ┆ null    │
└─────────┴─────────┘
Installed versions
--------Version info---------
Polars: 0.20.2
Index type: UInt32
Platform: Windows-10-10.0.19045-SP0
Python: 3.11.7 (tags/v3.11.7:fa7a6f2, Dec 4 2023, 19:24:49) [MSC v.1937 64 bit (AMD64)]
----Optional dependencies----
adbc_driver_manager: <not installed>
cloudpickle: <not installed>
connectorx: <not installed>
deltalake: <not installed>
fsspec: <not installed>
gevent: <not installed>
matplotlib: 3.8.2
numpy: 1.25.2
openpyxl: <not installed>
pandas: 2.1.4
pyarrow: 12.0.1
pydantic: 2.5.3
pyiceberg: <not installed>
pyxlsb: <not installed>
sqlalchemy: <not installed>
xlsx2csv: <not installed>
xlsxwriter: <not installed>