iceberg-python

Possible read-after-write consistency issue with multiple schema migration steps in Iceberg tables on AWS Glue + S3

Open: din14970 opened this issue 2 months ago • 1 comment

Apache Iceberg version

PyIceberg 0.10.0, pyiceberg-core 0.6.0

Please describe the bug 🐞

This may be a hard one to pin down, but I noticed that multiple schema migration steps executed sequentially in the same update_schema context sometimes raise exceptions, such as a column name not being found, when using Iceberg tables on AWS Glue. An example:

with table.update_schema() as update:
    update.rename_column("some_column", "renamed_column")
    update.move_first("renamed_column")  # this sometimes fails with an error
                                         # that renamed column doesn't exist

I have not seen this with other catalog back ends such as SQLite, which leads me to believe it is specific to Glue: a write may not yet be visible by the time the next operation runs.
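Since the failure is intermittent, a small loop that repeats the two steps may help show how often it happens. The following is only a sketch: the helper name repro_rename_then_move and its parameters are made up for illustration, and it assumes nothing about the table beyond the update_schema() context manager with rename_column and move_first used above. It stops at the first exception, because it is not clear what state the table is left in after a failed update.

```python
def repro_rename_then_move(table, old_name, new_name, attempts=20):
    """Repeat rename_column + move_first in a single update_schema context.

    `table` is assumed to be a PyIceberg table (or any object exposing the
    same update_schema() context manager). Returns a tuple of
    (number_of_successful_attempts, first_exception_or_None).
    """
    src, dst = old_name, new_name
    for attempt in range(attempts):
        try:
            with table.update_schema() as update:
                update.rename_column(src, dst)
                update.move_first(dst)
        except Exception as exc:  # broad on purpose: we only record the failure
            # Stop here; the table state after a failed update is unknown.
            return attempt, exc
        # The update went through, so rename back on the next attempt.
        src, dst = dst, src
    return attempts, None
```

With a real catalog this could be called as repro_rename_then_move(catalog.load_table("db.tbl"), "some_column", "renamed_column"), where "db.tbl" is a placeholder identifier. Running the same loop against both a Glue catalog and a SQLite catalog would show whether the error really only appears on Glue.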

Willingness to contribute

  • [ ] I can contribute a fix for this bug independently
  • [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • [ ] I cannot contribute a fix for this bug at this time

din14970, Oct 10 '25 09:10

Hmm, we should definitely see if we can reproduce this in CI by adding to https://github.com/apache/iceberg-python/pull/2371 (which was just merged).

Ideally we would also set up integration tests for Glue there.

jayceslesar, Oct 10 '25 15:10