V3 Tracking issue
Tracks the progress of the V3 implementation
- [ ] New DataTypes
  - [x] UnknownType: Added in https://github.com/apache/iceberg-python/pull/1681
  - [x] Timestamp(Tz)NanoType: Introduced in https://github.com/apache/iceberg-python/pull/1632
  - [ ] VariantType: https://github.com/apache/iceberg-python/issues/1819
  - [ ] Geometry/Geography: https://github.com/apache/iceberg-python/issues/1820
- [ ] Default value:
  - [ ] https://github.com/apache/iceberg-python/issues/1836 / https://github.com/apache/iceberg-python/pull/1644
  - [x] https://github.com/apache/iceberg-python/pull/1770
  - [x] Prerequisite: https://github.com/apache/iceberg-python/pull/1805
- [x] Multi-arg transforms
  - Reading of the metadata has been added in https://github.com/apache/iceberg-python/pull/1554
  - Supporting multi-arg transforms still needs to happen
- [ ] Row-level lineage
  - https://github.com/apache/iceberg-python/issues/1821
- [ ] Deletion vectors
  - [x] Read support is being added in https://github.com/apache/iceberg-python/pull/1516
  - [ ] Write support
I'd love to get my hands on working with some more new types (when the dependencies are resolved).
Could you assign me to either the VariantType or GeoTypes? (or both? haha) :)
Hey @sungwy, it is first come, first served! :D Please check out the issues for the types themselves. I think there are some dependencies, but I'm curious to learn what you think.
Happy to see an issue tracking this 🚀 !
The V3 spec changes for encryption have merged in https://github.com/apache/iceberg/pull/12162. There's an open PR (edit: now merged) for the resultant changes to table metadata on the Java side here - https://github.com/apache/iceberg/pull/12927.
@Fokko, can we add encryption to the list here, so we're tracking it somewhere for PyIceberg? (Update: I put up https://github.com/apache/iceberg-python/issues/1972 for metadata read change tracking)
@Fokko
I see that reading equality deletes is not implemented yet, and writing both positional and equality deletes is also not implemented.
I am aware that the community deprecates position deletes in v3 and might deprecate equality deletes as well in the future, so I want to ask: what is the vision for PyIceberg on these topics? Thanks!
@stevie9868
I am interested in Deletion vectors work if no one is working on it.
DV read is implemented in #1516. I don't think anyone has started on write support.
Also, I see that read for equality deletes are not implemented yet and write for both positional/equality deletes are also not implemented.
Even with the deprecation, I think it would be nice to support reading equality deletes for feature completeness. Writing positional deletes would be nice too. Writing equality deletes would also be nice, but we'd like to discourage it given the deprecation.
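For anyone new to the two delete styles discussed above, here is a rough, simplified sketch of how they differ at read time. It is plain Python, not PyIceberg code: the helper names `apply_position_deletes` and `apply_equality_deletes` and the sample rows are made up for illustration, and it ignores details such as sequence numbers and null handling that a real reader has to respect.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def apply_position_deletes(rows: List[Row], deleted_positions: Set[int]) -> List[Row]:
    """Drop rows whose ordinal position in the data file is marked as deleted."""
    return [row for pos, row in enumerate(rows) if pos not in deleted_positions]


def apply_equality_deletes(
    rows: List[Row], delete_rows: List[Row], equality_fields: List[str]
) -> List[Row]:
    """Drop rows whose values in `equality_fields` match any delete row."""
    deleted_keys = {tuple(d[f] for f in equality_fields) for d in delete_rows}
    return [
        row
        for row in rows
        if tuple(row[f] for f in equality_fields) not in deleted_keys
    ]


# Hypothetical rows standing in for the contents of one data file.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
]

# Position delete: remove the row at position 1 (the second row).
after_pos = apply_position_deletes(rows, {1})

# Equality delete: remove every row where id == 3.
after_eq = apply_equality_deletes(rows, [{"id": 3}], ["id"])
```

A deletion vector is conceptually close to the position-based case: it records deleted row positions for a single data file, stored as a bitmap rather than as rows in a separate delete file.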
@stevie9868 are you still planning on taking on the deletion vector work? I'm happy to contribute some cycles if you don't have any
@rambleraptor
Yeah, I am planning to work on it soon. There is ongoing work to build the DeleteFileIndex, which the DV work might depend on.
Can the current version of PyIceberg read a v3 table correctly?
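As a starting point for that question, it can help to confirm which format version a table actually declares. The sketch below reads the `format-version` field from a table metadata JSON document using only the standard library. The helper name `declared_format_version` and the sample metadata are hypothetical, and a value of 3 only says the table is declared as v3; it says nothing about whether every v3 feature used by that table is supported by a given reader.

```python
import json


def declared_format_version(metadata_json: str) -> int:
    """Return the `format-version` declared in a table metadata JSON document."""
    metadata = json.loads(metadata_json)
    return int(metadata["format-version"])


# Made-up, heavily trimmed metadata; a real metadata file has many more fields.
sample_metadata = json.dumps(
    {
        "format-version": 3,
        "table-uuid": "00000000-0000-0000-0000-000000000000",
    }
)

version = declared_format_version(sample_metadata)
print(f"declared format version: {version}")
```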
This is very exciting. I see the deletion vector write support is still pending. Is this something under way or is it something the community could help with?
cc @yingjianwu98 @rambleraptor