iceberg-python
iceberg-python copied to clipboard
Format-versioned `Snapshot`s in light of V3 additions
Feature Request / Improvement
While thinking about https://github.com/apache/iceberg-python/issues/1971 and https://github.com/apache/iceberg-python/issues/1972, I realised that V3 introduces new fields to Snapshot - one required for V3 and the other not.
As it stands, it feels inelegant to add the V3 required field as an optional field on the Snapshot class and e.g. check within TableMetadata construction that it's present if the table is V3 (or just not do this at all). I think it might be nicer to encode that information within the typing (model), similar to the TableMetadataV3 excerpt below. https://github.com/apache/iceberg-python/blob/201057e4152e6223717b9ab77d8acea3438e26c1/pyiceberg/table/metadata.py#L552-L560
I'm therefore wondering about about "versioning" Snapshot similar to TableMetadata, so that V3 TableMetadata would contain a list of V3 Snapshots. Then, if V3 snapshot fields are present in V2 metadata, we'd get the benefit of throwing which I think is nice about PyIceberg's TableMetadata Union setup here compared to other implementations.
(I've not fleshed out the details here so not certain this is feasible but dropping an issue for now. Perhaps this has already been discussed / thought about 😄)
Hey @smaheshwar-pltr Thanks for bringing this up.
I'm therefore wondering about "versioning"
Snapshotsimilar toTableMetadata, so that V3TableMetadatawould contain a list of V3Snapshots.
The problem is that from the moment we upgrade a table from {V1,V2} to V3, the field is not there, so we still would run into deserialization issues. For simplicity, I'm leaning towards not versioning because we still would need to check if the fields are not-null, as they stay null after bumping the version to V3: https://iceberg.apache.org/spec/#row-lineage-for-upgraded-tables
Ooh thanks a lot for pointing that out @Fokko, I think the upgrade procedure would indeed make versioning complicated. Siding with you now
@smaheshwar-pltr Are you interested in adding those fields?
@smaheshwar-pltr Are you interested in adding those fields?
Happy for someone else to take a stab!
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.