Polars cannot read DeltaBinaryPacked encoded files
Checks
- [X] I have checked that this issue has not already been reported.
- [X] I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl

filepath = "data/1.parquet"
df = pl.scan_parquet(filepath, n_rows=200).collect()  # the error is raised on collect()
Log output
File "/main.py", line 17, in <module>
.collect()
File "/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1943, in collect
return wrap_df(ldf.collect())
polars.exceptions.ComputeError: Decoding Int64 "DeltaBinaryPacked"-encoded required parquet pages not yet implemented
Issue description
Polars cannot read Parquet columns that use the Delta Binary Packed (DELTA_BINARY_PACKED) encoding, as described here: https://parquet.apache.org/docs/file-format/data-pages/encodings/#delta-encoding-delta_binary_packed--5
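For context, the core idea of the encoding can be sketched in a few lines of plain Python. This is a simplified illustration, not the byte-level format: it handles a single block, and it skips the varint/zigzag header, the miniblock layout, and the bit-packing itself. The function name `delta_block_summary` is made up for this sketch.

```python
def delta_block_summary(values):
    """Return (first_value, min_delta, relative_deltas, bit_width) for one block.

    Simplified view of DELTA_BINARY_PACKED: store the first value, then
    deltas between neighbours, shifted by the minimum delta so that they
    are non-negative and fit in a small bit width.
    """
    if not values:
        raise ValueError("need at least one value")
    first = values[0]
    deltas = [b - a for a, b in zip(values, values[1:])]
    if not deltas:
        return first, 0, [], 0
    min_delta = min(deltas)
    relative = [d - min_delta for d in deltas]
    # Bits needed to represent the largest relative delta.
    bit_width = max(relative).bit_length()
    return first, min_delta, relative, bit_width


print(delta_block_summary([7, 5, 3, 1, 2, 3, 4, 5]))
# (7, -2, [0, 0, 0, 3, 3, 3, 3], 2)
```

A reader has to reverse these steps (unpack the bits, add the minimum delta back, and accumulate from the first value), which is the part the error message reports as not yet implemented for Int64.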
Expected behavior
Polars should be able to read Parquet files with Delta Binary Packed encoded columns.
Installed versions
--------Version info---------
Polars: 0.20.16
Index type: UInt32
Platform: Linux-6.6.10-76060610-generic-x86_64-with-glibc2.35
Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
----Optional dependencies----
adbc_driver_manager: <not installed>
cloudpickle: <not installed>
connectorx: <not installed>
deltalake: <not installed>
fastexcel: <not installed>
fsspec: 2024.3.1
gevent: <not installed>
hvplot: <not installed>
matplotlib: <not installed>
numpy: 1.26.4
openpyxl: <not installed>
pandas: 2.2.1
pyarrow: 15.0.2
pydantic: <not installed>
pyiceberg: <not installed>
pyxlsb: <not installed>
sqlalchemy: <not installed>
xlsx2csv: <not installed>
xlsxwriter: <not installed>