[Python] 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict'
Describe the bug, including details regarding any error messages, version, and platform.
When a row group has a SortingColumn, the metadata of a ParquetFile cannot be serialized with to_dict(), because SortingColumn does not implement that method.
import polars as pl
import pyarrow.parquet as pq
df = pl.DataFrame({"a": [1, 2], "b": [10, 11]})
fname = "tmp.parquet"
pq.write_table(
    df.to_arrow(),
    fname,
    sorting_columns=[pq.SortingColumn(0)],
)
pqf = pq.ParquetFile(fname)
print(pqf.metadata.row_group(0).sorting_columns[0])
print(pqf.metadata.to_dict())
This results in:
SortingColumn(column_index=0, descending=False, nulls_first=False)
...
File "pyarrow/_parquet.pyx", line 892, in pyarrow._parquet.FileMetaData.to_dict
File "pyarrow/_parquet.pyx", line 790, in pyarrow._parquet.RowGroupMetaData.to_dict
AttributeError: 'pyarrow._parquet.SortingColumn' object has no attribute 'to_dict'
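Until a fix is available, one possible workaround is to build the dictionary by hand from the three attributes shown in the printed repr above (column_index, descending, nulls_first). This is only a minimal sketch: the helper name sorting_column_to_dict is hypothetical and not part of pyarrow, and the example uses a stand-in object rather than a real SortingColumn so that it runs without writing a Parquet file.

```python
from types import SimpleNamespace


def sorting_column_to_dict(sorting_column):
    """Hypothetical helper (not a pyarrow API).

    Reads the three attributes that appear in the SortingColumn repr
    and returns them as a plain dict.
    """
    return {
        "column_index": sorting_column.column_index,
        "descending": sorting_column.descending,
        "nulls_first": sorting_column.nulls_first,
    }


# Stand-in with the same attribute names as the SortingColumn printed above.
example = SimpleNamespace(column_index=0, descending=False, nulls_first=False)
result = sorting_column_to_dict(example)
print(result)
```

With the file from the reproduction, the same helper could presumably be applied to each entry of pqf.metadata.row_group(0).sorting_columns. It does not make pqf.metadata.to_dict() itself work.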
Component(s)
Parquet, Python
@tlm365 Can you reply here? I don't know why this issue hasn't been assigned to you :-( Maybe you can first comment "take" or reply here, and then I can assign it to you.
take
Issue resolved by pull request 41704: https://github.com/apache/arrow/pull/41704