Remove `initial_change` when dealing with table updates

Open kevinjqliu opened this issue 1 year ago • 1 comments

Closes #864

Identified in #864: `TableMetadata` is initialized with the default Pydantic objects for `schema`, `partition_spec`, and `sort_order`, which does not play well with table updates. Specifically, the `initial_change` field is an implementation detail of pyiceberg that the REST API does not know about, so table update objects from the REST API do not understand this field. We can safely remove `initial_change` by modifying the logic for dealing with table updates.

kevinjqliu, Jul 21 '24 18:07

@HonahX do you mind taking a look at this when you get a chance?

kevinjqliu, Sep 11 '24 17:09

Hi @kevinjqliu, I ran a few experiments and found that removing `initial_change` would be challenging unless we can temporarily disable Pydantic's validators during `update_table_metadata`. Fortunately, Pydantic's `model_construct` can bypass the validators in this context.
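For readers unfamiliar with `model_construct`, here is a minimal sketch of the general idea, assuming Pydantic v2. The `Metadata` model, its fields, and its validator are hypothetical stand-ins for illustration only; they are not pyiceberg's `TableMetadata` and not the code in the draft PR.

```python
from typing import List

from pydantic import BaseModel, ValidationError, model_validator


class Metadata(BaseModel):
    """Hypothetical stand-in model; NOT pyiceberg's TableMetadata."""

    schema_ids: List[int] = []
    current_schema_id: int = 0

    @model_validator(mode="after")
    def current_schema_must_exist(self) -> "Metadata":
        # Invariant that only holds once all updates have been applied.
        if self.current_schema_id not in self.schema_ids:
            raise ValueError("current_schema_id does not refer to a known schema")
        return self


# Normal construction runs the validator, so an intermediate state is rejected.
try:
    Metadata(schema_ids=[], current_schema_id=0)
    rejected = False
except ValidationError:
    rejected = True

# model_construct skips validation, so the intermediate state can exist.
partial = Metadata.model_construct(schema_ids=[], current_schema_id=0)

# Once the remaining update is applied, validate the final state as usual.
final = Metadata.model_validate({**partial.model_dump(), "schema_ids": [0]})
```

The trade-off is that `model_construct` trusts its input completely, so a sketch like this only makes sense if the final state is validated again before it is used.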

I’ve implemented this approach and created a draft PR: https://github.com/apache/iceberg-python/pull/1219. I’d love to hear your thoughts on it!

HonahX, Oct 06 '24 08:10

Closing this in favor of #1219

kevinjqliu, Oct 29 '24 22:10