iceberg-python

V3 Tracking issue

Open Fokko opened this issue 9 months ago • 8 comments

Tracks the progress of the V3 implementation

  • [ ] New DataTypes
    • [x] UnknownType: Added in https://github.com/apache/iceberg-python/pull/1681
    • [x] Timestamp(Tz)NanoType: Introduced in https://github.com/apache/iceberg-python/pull/1632
    • [ ] VariantType: https://github.com/apache/iceberg-python/issues/1819
    • [ ] Geometry/Geography: https://github.com/apache/iceberg-python/issues/1820
  • [ ] Default value:
    • [ ] https://github.com/apache/iceberg-python/issues/1836 / https://github.com/apache/iceberg-python/pull/1644
    • [x] https://github.com/apache/iceberg-python/pull/1770
    • [x] Prerequisite: https://github.com/apache/iceberg-python/pull/1805
  • [x] Multi-arg transforms
    • Reading of the metadata has been added in https://github.com/apache/iceberg-python/pull/1554
    • Supporting multi-arg transforms still needs to happen
  • [ ] Row-level lineage
    • https://github.com/apache/iceberg-python/issues/1821
  • [ ] Deletion vectors
    • [x] Read support is being added in https://github.com/apache/iceberg-python/pull/1516
    • [ ] Write support

Fokko avatar Mar 20 '25 11:03 Fokko

I'd love to get my hands on working with some more new types (when the dependencies are resolved).

Could you assign me to either the VariantType or GeoTypes? (or both? haha) :)

sungwy avatar Mar 21 '25 00:03 sungwy

Hey @sungwy, it is first come, first served! :D Please check out the issues for the types themselves. I think there are some dependencies, but I'm curious to learn what you think.

Fokko avatar Mar 24 '25 09:03 Fokko

Happy to see an issue tracking this 🚀 !

The V3 spec changes for encryption have merged in https://github.com/apache/iceberg/pull/12162. There's an open PR (edit: now merged) for the resultant changes to table metadata on the Java side here - https://github.com/apache/iceberg/pull/12927.

@Fokko, can we add encryption to the list here, so we're tracking it somewhere for PyIceberg? (Update: I put up https://github.com/apache/iceberg-python/issues/1972 for metadata read change tracking)

smaheshwar-pltr avatar May 06 '25 16:05 smaheshwar-pltr

@Fokko

I see that read support for equality deletes is not implemented yet, and write support for both positional and equality deletes is also not implemented.

I am aware that the community is deprecating position deletes in v3 and might deprecate equality deletes as well in the future, so I want to ask: what is the vision for pyiceberg on these topics? Thanks!

yingjianwu98 avatar May 31 '25 00:05 yingjianwu98

@stevie9868

I am interested in Deletion vectors work if no one is working on it.

DV read is implemented in #1516. I don't think anyone has started on write support.

Also, I see that read for equality deletes are not implemented yet and write for both positional/equality deletes are also not implemented.

Even with the deprecation, I think it would be nice to support reading equality deletes for feature completeness. Writing positional deletes would be nice too. Writing equality deletes would also be nice, but we'd like to discourage it given the deprecation.
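
For anyone following along who is less familiar with the distinction, here is a rough sketch in plain Python of how the two delete styles differ when applied at read time. This is not PyIceberg's API: `Row`, `apply_positional_deletes`, and `apply_equality_deletes` are hypothetical names made up for illustration, and the sketch ignores sequence numbers and file formats entirely. A deletion vector can loosely be thought of as a compact encoding of the positional case for a single data file.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for a row read from a data file.
    file_path: str  # data file the row lives in
    pos: int        # 0-based position of the row within that file
    values: dict    # column name -> value


def apply_positional_deletes(rows, deleted_positions):
    """Drop rows whose (file_path, pos) pair appears in deleted_positions."""
    return [r for r in rows if (r.file_path, r.pos) not in deleted_positions]


def apply_equality_deletes(rows, delete_predicates):
    """Drop rows that match every column of any predicate dict."""
    def matches(row, pred):
        return all(row.values.get(col) == val for col, val in pred.items())

    return [r for r in rows if not any(matches(r, p) for p in delete_predicates)]


rows = [
    Row("a.parquet", 0, {"id": 1, "name": "x"}),
    Row("a.parquet", 1, {"id": 2, "name": "y"}),
    Row("b.parquet", 0, {"id": 2, "name": "z"}),
]

# Positional: remove exactly one row, identified by file and position.
after_pos = apply_positional_deletes(rows, {("a.parquet", 1)})

# Equality: remove every row with id == 2, wherever it is stored.
after_eq = apply_equality_deletes(rows, [{"id": 2}])
```

The positional delete only touches the one row it names, while the equality delete has to be evaluated against every row in scope, which is part of why reading equality deletes is the more involved feature.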

kevinjqliu avatar Jun 14 '25 18:06 kevinjqliu

@stevie9868 are you still planning on taking on the deletion vector work? I'm happy to contribute some cycles if you don't have any

rambleraptor avatar Jun 30 '25 21:06 rambleraptor

@rambleraptor

Yeah, I am planning to work on it soon. There is ongoing work to build the DeleteFileIndex, which the DV work might depend on.

yingjianwu98 avatar Jul 01 '25 22:07 yingjianwu98

Can the current version of pyiceberg read a v3 table correctly?

aschreiber1 avatar Nov 14 '25 12:11 aschreiber1

This is very exciting. I see the deletion vector write support is still pending. Is this something under way or is it something the community could help with?

cc @yingjianwu98 @rambleraptor

glesperance avatar Dec 06 '25 18:12 glesperance