iceberg-python

Add metadata tables

Open Fokko opened this issue 1 year ago • 13 comments

Feature Request / Improvement

In Iceberg Spark, there are metadata tables that provide information about the table: https://iceberg.apache.org/docs/latest/spark-queries/

The most important tables are:

  • [ ] Files assigned to @Gowthami03B, PR in https://github.com/apache/iceberg-python/pull/614
  • [x] Snapshots assigned to @Fokko in https://github.com/apache/iceberg-python/pull/524/
  • [ ] History assigned to @ndrluis
  • [ ] Metadata log entries assigned to @kevinjqliu (issue in https://github.com/apache/iceberg-python/issues/594), PR in https://github.com/apache/iceberg-python/pull/667
  • [ ] Manifests assigned to @geruh, PR in https://github.com/apache/iceberg-python/pull/717
  • [ ] Partitions assigned to @syun64 (issue in https://github.com/apache/iceberg-python/issues/24)
  • [x] References assigned to @geruh in https://github.com/apache/iceberg-python/pull/602
  • [x] Entries assigned to @Fokko in https://github.com/apache/iceberg-python/pull/551

Mar 11 '24 08:03 Fokko

@Fokko I can take a stab at this!

Mar 11 '24 14:03 Gowthami03B

@Gowthami03B That would be great, any specific one that you have in mind?

Mar 12 '24 07:03 Fokko

Do we have a priority in mind? I just saw your comments on https://github.com/apache/iceberg-python/issues/516. Should we tackle Files first?

Mar 14 '24 13:03 Gowthami03B

SGTM, I'll take a first stab at the snapshots one 👍

Mar 14 '24 15:03 Fokko

@Fokko, could you assign the History table to me?

I was studying the Java implementation of the History table and noticed that we need some utility functions to handle the snapshot ancestors. There is a WIP task to implement them in PR #533, so I believe it would be good to wait for that implementation.
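
To make the dependency concrete, here is a minimal, self-contained sketch of the kind of ancestor helper a History table needs. It is not the code from PR #533 or from the Java implementation: `Snapshot`, `ancestors_of`, and `history_rows` are hypothetical names, and the snapshot log is assumed to be a plain list of `(timestamp_ms, snapshot_id)` pairs. The row keys follow the columns of the Spark history table (`made_current_at`, `snapshot_id`, `parent_id`, `is_current_ancestor`).

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical, simplified stand-in for a table snapshot.
    snapshot_id: int
    parent_snapshot_id: Optional[int]


def ancestors_of(
    snapshot_id: Optional[int], snapshots: Dict[int, Snapshot]
) -> Iterator[Snapshot]:
    """Yield the given snapshot, then its parent, and so on up to the root."""
    next_id = snapshot_id
    while next_id is not None and next_id in snapshots:
        snapshot = snapshots[next_id]
        yield snapshot
        next_id = snapshot.parent_snapshot_id


def history_rows(
    snapshot_log: List[Tuple[int, int]],
    current_snapshot_id: Optional[int],
    snapshots: Dict[int, Snapshot],
) -> List[dict]:
    """Build one history row per snapshot-log entry (assumed log shape)."""
    ancestor_ids = {s.snapshot_id for s in ancestors_of(current_snapshot_id, snapshots)}
    rows = []
    for timestamp_ms, snapshot_id in snapshot_log:
        rows.append(
            {
                "made_current_at": timestamp_ms,
                "snapshot_id": snapshot_id,
                "parent_id": snapshots[snapshot_id].parent_snapshot_id,
                "is_current_ancestor": snapshot_id in ancestor_ids,
            }
        )
    return rows


# Snapshot 4 was committed on top of 2 and later abandoned; 3 is current.
snapshots = {
    1: Snapshot(1, None),
    2: Snapshot(2, 1),
    4: Snapshot(4, 2),
    3: Snapshot(3, 2),
}
snapshot_log = [(100, 1), (200, 2), (300, 4), (500, 3)]
rows = history_rows(snapshot_log, current_snapshot_id=3, snapshots=snapshots)
```

In this example only the row for snapshot 4 has `is_current_ancestor` set to `False`, because it is not on the parent chain of the current snapshot.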

Mar 18 '24 23:03 ndrluis

@ndrluis certainly! 👍

Mar 26 '24 16:03 Fokko

Hi @Fokko, could I pick up the Partitions table?

Apr 10 '24 18:04 sungwy

Hey Fokko, I'll take references here ☝️

Apr 11 '24 19:04 geruh

@geruh Great seeing you here again. I've assigned it to you 👍

Apr 11 '24 19:04 Fokko

Hey @Fokko, I can try working on the Manifests table if no one is assigned to it. Update: synced offline with @geruh, who is already working on this item.

Apr 17 '24 04:04 rahil-c

@Gowthami03B checking in to see whether you're still interested in contributing the Files table.

Apr 17 '24 11:04 Fokko

@Fokko yes, I will be sending over a PR shortly.

Apr 17 '24 13:04 Gowthami03B

Can someone review this? @Fokko @HonahX https://github.com/apache/iceberg-python/pull/614

Apr 18 '24 04:04 Gowthami03B

The last metadata table has been merged. Thanks, everyone, for all the great work and contributions. I think this issue can be closed.

Jul 04 '24 04:07 HonahX