iceberg-rust
[epic] address manifest reader feature gaps between rust and python implementations
What feature are you trying to implement?
See https://github.com/apache/iceberg-python/pull/2004 for the integration, in which pyiceberg uses the Rust-based manifest reader.
Here's the error log from `make integration`, grouped by error type:
https://gist.github.com/kevinjqliu/db6352f0b6d0ab8a717af67a1b71355e
- [x] Convert raw literal (bytes) to binary type
  - "pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: DataInvalid => Unable to convert raw literal (bytes) fail convert to type binary for: todo: rust avro doesn't support deserialize any bytes representation now"
- [x] Convert raw literal (bytes) to decimal(5,2) type
  - "pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: DataInvalid => Unable to convert raw literal (bytes) fail convert to type decimal(5,2) for: todo: rust avro doesn't support deserialize any bytes representation now"
- [x] partition field with special string characters, `special#string+field`
- [x] partition field with uuid
- [x] V3 manifests
  - Fail to parse format version in manifest metadata
- [ ] `files` metadata table `lower_bounds`
  - `tests/integration/test_inspect_table.py::test_inspect_files[2] - AssertionError: Difference in column lower_bounds: {} != {2147483546: b's3://warehouse/default/table_metadata_files/data/00000-0-f5c93fd4-42af-481f-bcc0-140fad66f25a.parquet', 2147483545: b'\x00\x00\x00\x00\x00\x00\x00\x00'}`
  - PR is out: https://github.com/apache/iceberg-rust/pull/1849
- [x] manifest file content after merge
  - `tests/integration/test_writes/test_writes.py::test_merge_manifests_file_content[2] - AssertionError: assert [(2, 78), (4,...(8, 118), ...] == [(1, 49), (2,... (6, 94), ...]`
- [x] `equality_ids` can be optional (fixed by #1705)
- [x] uuid support (fixed by #1706)
- [x] enable zstd (fixed by #1692)
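
For context on the decimal(5,2) item: the Iceberg spec serializes a decimal single value as the unscaled integer in two's-complement big-endian bytes, so the raw literal has to be decoded with the scale taken from the field type. Below is a minimal Python sketch of that decoding. The helper name `decode_iceberg_decimal` is hypothetical and is not part of pyiceberg or iceberg-rust; it only illustrates the byte layout, not how either library implements the conversion.

```python
from decimal import Decimal


def decode_iceberg_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode a decimal stored as an unscaled two's-complement big-endian integer.

    Hypothetical helper for illustration only; `scale` comes from the
    field type, e.g. 2 for decimal(5,2).
    """
    # Interpret the bytes as a signed big-endian integer (the unscaled value).
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    # Shift the decimal point left by `scale` digits.
    return Decimal(unscaled).scaleb(-scale)


if __name__ == "__main__":
    # 0x3039 is 12345, so with scale 2 this is 123.45.
    print(decode_iceberg_decimal(b"\x30\x39", 2))
    # 0xCFC7 is -12345 as a two-byte two's-complement value.
    print(decode_iceberg_decimal(b"\xcf\xc7", 2))
```

The scale is not stored in the bytes themselves, which is why the reader needs the target type before it can convert the raw literal.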
Willingness to contribute
None
One left 🥳
The last one should be addressed by #1849 🥳
Looks like this can be closed out