Daft
error writing parquet: `metadata listed 100000 rows but only read: 100185`
Describe the bug
To Reproduce
Using the same lineitem file, found here:
aws s3 cp s3://daft-public-data/testing_data/bad-polars-lineitem.parquet ./lineitem.parquet --no-sign-request
import daft
(daft.read_parquet('./lineitem.parquet')
.limit(100000)
.write_parquet('lineitem.parquet'))
DaftCoreException: DaftError::External Parquet file: file://./lineitem.parquet metadata listed 100000 rows but only read: 100185
Expected behavior
Able to write the file.