iceberg-python

Schema Evolution with `StructType` via `update_schema()` Fails

Open mukul-mpac opened this issue 2 months ago • 11 comments

Apache Iceberg version

0.10.0 (latest release)

Please describe the bug 🐞

Environment / Setup Details

  • PyIceberg version: 0.10.0 (latest release)
  • Catalog: AWS Glue
  • Dependencies: includes pyarrow
  • Python version: 3.12.10 (per the traceback below)

The Problem

Given an existing table test with schema:

Schema(
    NestedField(1, "id", StringType(), required=True),
    NestedField(2, "name", StringType(), required=False),
    NestedField(3, "roll_number", IntegerType(), required=True),
)

I attempt to evolve the schema after table creation by adding a new column address of type StructType:

StructType(
    NestedField(4, "street", StringType(), required=False),
    NestedField(5, "city", StringType(), required=False),
    NestedField(6, "state", StringType(), required=False),
    NestedField(7, "zip", IntegerType(), required=False),
)

Using the update_schema() context manager and its add_column(...) method to add this StructType field results in a BadRequestError:

pyiceberg.exceptions.BadRequestError: InvalidInputException: Cannot parse to an integer value: id: 5.0

What should happen:

  • The new StructType field should be added without errors.
  • You should be able to evolve a schema to include nested/struct types via update_schema() just as you can at table creation.
  • I remember this working until last Thursday (18 September 2025).

What is actually happening:

  • Adding a StructType via update_schema() throws InvalidInputException: Cannot parse to an integer value: id: 5.0.
  • The error indicates something is trying to parse “5.0” (a float) as an integer, presumably where a field-id or column ID is expected to be an integer.
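
As a rough illustration of that failure mode (a sketch only; how the catalog actually parses the request is not visible from the client), an integer that passes through a float representation and is then parsed strictly as an integer fails in the same way. The helper `strict_int` is hypothetical and exists only for this example:

```python
import json


def strict_int(text: str) -> int:
    """Parse text as an integer, rejecting float-looking input such as '5.0'."""
    try:
        return int(text)
    except ValueError as exc:
        raise ValueError(f"Cannot parse to an integer value: {text}") from exc


# A client sends an integer field ID...
payload = json.dumps({"id": 5})

# ...but a parser that reads every JSON number as a float turns it into 5.0.
as_float = json.loads(payload, parse_int=float)["id"]
print(as_float)  # 5.0

# Re-parsing the float's text form as an integer then fails.
try:
    strict_int(str(as_float))
except ValueError as exc:
    print(exc)  # Cannot parse to an integer value: 5.0
```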

Full traceback

Traceback (most recent call last):
  File "/Users/mukul/Documents/extra/iceberg/iceberg.py", line 318, in <module>
    with table.update_schema() as updater:
         ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/table/update/__init__.py", line 76, in __exit__
    self.commit()
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/table/update/__init__.py", line 72, in commit
    self._transaction._apply(*self._commit())
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/table/__init__.py", line 295, in _apply
    self.commit_transaction()
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/table/__init__.py", line 936, in commit_transaction
    self._table._do_commit(  # pylint: disable=W0212
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/table/__init__.py", line 1458, in _do_commit
    response = self.catalog.commit_table(self, requirements, updates)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/tenacity/__init__.py", line 338, in wrapped_f
    return copy(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/tenacity/__init__.py", line 477, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/tenacity/__init__.py", line 378, in iter
    result = action(retry_state=retry_state)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
    self._add_action_func(lambda rs: rs.outcome.result())
                                     ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.10_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/[email protected]/3.12.10_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/tenacity/__init__.py", line 480, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/catalog/rest/__init__.py", line 722, in commit_table
    _handle_non_200_response(
  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/catalog/rest/response.py", line 111, in _handle_non_200_response
    raise exception(response) from exc
pyiceberg.exceptions.BadRequestError: InvalidInputException: Cannot parse to an integer value: id: 5.0

Steps to Reproduce

  1. Create a table with the original schema:
schema = Schema(
    NestedField(1, "id", StringType(), required=True),
    NestedField(2, "name", StringType(), required=False),
    NestedField(3, "roll_number", IntegerType(), required=True),
)

table = catalog.create_table(
    identifier=table_id,
    schema=schema,
)
  2. Load the table and attempt schema evolution:
table = catalog.load_table(table_id)
with table.update_schema() as updater:
    updater.add_column(
        path="address",
        field_type=StructType(
            NestedField(4, "street", StringType(), required=False),
            NestedField(5, "city", StringType(), required=False),
            NestedField(6, "state", StringType(), required=False),
            NestedField(7, "zip", IntegerType(), required=False),
        ),
        required=False,
    )
  3. Observe the error above.

Additional Observations

  • The error only occurs when using update_schema() / schema-evolution after the table has been created.
  • Creating the table with the StructType already included does not cause this error.
  • Also, if the StructType field already exists (from creation) and then you try to add a new integer column (simple type) using update_schema(), you encounter a similar error.

Suggested Investigation / Possible Cause

  • The error message Cannot parse to an integer value: id: 5.0 suggests something is wrongly computing a field or column ID as a float (5.0 instead of integer 5).
  • Perhaps the incremental assignment of new field IDs in schema evolution is mishandled when adding nested/struct types.
  • Possible bug in the serialization or metadata packaging step, or in how nested field IDs are validated / sent to the catalog (Glue/REST) interface.

Willingness to contribute

  • [ ] I can contribute a fix for this bug independently
  • [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • [ ] I cannot contribute a fix for this bug at this time

mukul-mpac Sep 23 '25 21:09

Hey @mukul-mpac, thanks for reporting this issue. I am not able to reproduce it.

Here's what I tried:

from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import IntegerType, NestedField, StringType, StructType

catalog = load_catalog("default", type="in-memory")

catalog.create_namespace_if_not_exists("default")
schema = Schema(
    NestedField(1, "id", StringType(), required=True),
    NestedField(2, "name", StringType(), required=False),
    NestedField(3, "roll_number", IntegerType(), required=True),
)

table = catalog.create_table(
    identifier="default.test",
    schema=schema,
)

with table.update_schema() as updater:
    updater.add_column(
        path="address",
        field_type=StructType(
            NestedField(4, "street", StringType(), required=False),
            NestedField(5, "city", StringType(), required=False),
            NestedField(6, "state", StringType(), required=False),
            NestedField(7, "zip", IntegerType(), required=False),
        ),
        required=False,
    )

table.schema()
>>> table.schema()
Schema(NestedField(field_id=1, name='id', field_type=StringType(), required=True), NestedField(field_id=2, name='name', field_type=StringType(), required=False), NestedField(field_id=3, name='roll_number', field_type=IntegerType(), required=True), NestedField(field_id=4, name='address', field_type=StructType(fields=(NestedField(field_id=5, name='street', field_type=StringType(), required=False), NestedField(field_id=6, name='city', field_type=StringType(), required=False), NestedField(field_id=7, name='state', field_type=StringType(), required=False), NestedField(field_id=8, name='zip', field_type=IntegerType(), required=False),)), required=False), schema_id=1, identifier_field_ids=[])
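
Note that in the output above the field IDs passed to add_column (4 through 7) are not kept: address receives ID 4 and its children are renumbered 5 through 8. A minimal sketch of that renumbering, using plain dicts rather than PyIceberg's classes; `assign_fresh_ids` is a hypothetical helper that mimics the observed result and is not PyIceberg's implementation:

```python
from itertools import count


def assign_fresh_ids(name, children, last_column_id):
    """Give the new struct column the next free ID, then renumber its children."""
    next_id = count(last_column_id + 1)
    column = {"id": next(next_id), "name": name, "fields": []}
    for child in children:
        # The ID supplied by the caller is ignored and replaced.
        column["fields"].append({"id": next(next_id), "name": child["name"]})
    return column


requested = [
    {"id": 4, "name": "street"},
    {"id": 5, "name": "city"},
    {"id": 6, "name": "state"},
    {"id": 7, "name": "zip"},
]
address = assign_fresh_ids("address", requested, last_column_id=3)
print(address["id"], [f["id"] for f in address["fields"]])  # 4 [5, 6, 7, 8]
```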

kevinjqliu Sep 24 '25 01:09

  File "/Users/mukul/Documents/extra/iceberg/venv/lib/python3.12/site-packages/pyiceberg/catalog/rest/response.py", line 111, in _handle_non_200_response
    raise exception(response) from exc
pyiceberg.exceptions.BadRequestError: InvalidInputException: Cannot parse to an integer value: id: 5.0

The error message comes from the catalog's response, and the catalog is AWS Glue in this case.

kevinjqliu Sep 24 '25 01:09

@kevinjqliu Thank you for the response,

I have started a discussion on the AWS forums based on your findings (here).

Are there any other steps you would recommend to solve this issue? I expect this to be a major blocker, since many developers must be using the AWS Glue REST API catalog.

mukul-mpac Sep 24 '25 16:09

Hey there, I think the API should be invoked as:

with table.update_schema() as updater:
    updater.add_column(("address", "street"), StringType(), required=False)
    ...

However, after a quick test, this still causes an issue. In this case, we should inject a StructType. Thoughts?

Fokko Sep 25 '25 21:09

@Fokko

with table.update_schema() as updater:
    updater.add_column(("address", "street"), StringType(), required=False)
    ...

The above works if address is already present as a StructType. It does not solve my current issue, but thank you for your reply.

mukul-mpac Oct 06 '25 13:10

Ran into this and after doing some testing I believe the problem is on the AWS side. Just commenting here because it may have more visibility than on the AWS forum. I am currently working with AWS support on the matter.

This is pretty easily reproducible; just try to update schema with a non-primitive new column (e.g. list or struct). For example:

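import pyarrow as pa
from pyiceberg.catalog import load_catalog

# account, bucket, and region are placeholders that must be defined beforehand.
catalog = load_catalog(
    'default',
    type='rest',
    warehouse=f'{account}:s3tablescatalog/{bucket}',
    uri=f'https://glue.{region}.amazonaws.com/iceberg',
    **{
        'rest.sigv4-enabled': 'true',
        'rest.signing-name': 'glue',
        'rest.signing-region': region,
    }
)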
catalog.create_namespace('scott_test')

# create table: OK
initial_schema = pa.schema([
    pa.field("a", pa.string(), nullable=True),
    pa.field("b", pa.string(), nullable=True),
])
catalog.create_table('scott_test.element_id_bug', initial_schema)

# update table w/ primitive type: OK
table = catalog.load_table('scott_test.element_id_bug')
update_schema = pa.schema([
    pa.field("c", pa.string(), nullable=True),
])
with table.update_schema() as update:
    update.union_by_name(update_schema)

# update table with list type: FAIL
table = catalog.load_table('scott_test.element_id_bug')
update_schema = pa.schema([
    pa.field("d", pa.list_(pa.string()), nullable=True),
])
with table.update_schema() as update:
    update.union_by_name(update_schema)

This last operation throws an exception:

BadRequestError: InvalidInputException: Cannot parse to an integer value: element-id: 5.0
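
One way to see what the client actually puts on the wire (a sketch, and not necessarily how the payload in this comment was captured) is the standard library's `http.client` debug output. This assumes the REST catalog's traffic goes through `requests`/urllib3, which build on `http.client`. The two helper names are made up for this example:

```python
import http.client


def enable_http_wire_logging() -> None:
    """Make http.client print every request line, header, and body to stdout."""
    # requests/urllib3 build on http.client, so this also covers their traffic.
    http.client.HTTPConnection.debuglevel = 1


def disable_http_wire_logging() -> None:
    """Turn the wire-level debug output off again."""
    http.client.HTTPConnection.debuglevel = 0


enable_http_wire_logging()
# ... run the failing `with table.update_schema() as update:` block here ...
disable_http_wire_logging()
```

The printed output includes request headers, so with SigV4 signing enabled the Authorization header should be redacted before the log is shared.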

I captured the REST payload and confirmed that element-id 5 is being sent as an integer:

{
  "identifier": {
    "namespace": [
      "scott_test"
    ],
    "name": "element_id_bug"
  },
  "requirements": [
    {
      "type": "assert-current-schema-id",
      "current-schema-id": 1
    },
    {
      "type": "assert-table-uuid",
      "uuid": "REDACTED"
    }
  ],
  "updates": [
    {
      "action": "add-schema",
      "schema": {
        "type": "struct",
        "fields": [
          {
            "id": 1,
            "name": "a",
            "type": "string",
            "required": false
          },
          {
            "id": 2,
            "name": "b",
            "type": "string",
            "required": false
          },
          {
            "id": 3,
            "name": "c",
            "type": "string",
            "required": false
          },
          {
            "id": 4,
            "name": "d",
            "type": {
              "type": "list",
              "element-id": 5,
              "element": "string",
              "element-required": false
            },
            "required": false
          }
        ],
        "schema-id": 2,
        "identifier-field-ids": []
      },
      "last-column-id": 5
    },
    {
      "action": "set-current-schema",
      "schema-id": -1
    }
  ]
}

So this pretty clearly seems like an AWS issue.
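
The same check can be automated on a captured payload. The sketch below walks a parsed JSON document and reports any ID-like key whose value is not an integer. `non_integer_ids` and the `ID_KEYS` set are hypothetical helpers, and the key names are taken from the payload above plus the map type's `key-id` and `value-id`:

```python
import json

ID_KEYS = {"id", "element-id", "key-id", "value-id", "schema-id",
           "current-schema-id", "last-column-id"}


def non_integer_ids(node, path="$"):
    """Return (path, value) pairs for ID-like keys whose value is not an int."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            # bool is a subclass of int in Python, so exclude it explicitly.
            if key in ID_KEYS and (isinstance(value, bool) or not isinstance(value, int)):
                problems.append((child_path, value))
            problems.extend(non_integer_ids(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(non_integer_ids(item, f"{path}[{index}]"))
    return problems


# The list column from the payload above: every ID is an integer.
list_field = json.loads(
    '{"id": 4, "name": "d", "type": {"type": "list", "element-id": 5, '
    '"element": "string", "element-required": false}, "required": false}'
)
print(non_integer_ids(list_field))  # []
```

An empty result for the client's payload supports the conclusion that the float appears only after the request reaches the catalog.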

I just tried another combination and this also triggers it:

# create table with list type: OK
initial_schema = pa.schema([
    pa.field("a", pa.string(), nullable=True),
    pa.field("b", pa.list_(pa.string()), nullable=True),
])
catalog.create_table('scott_test.element_id_bug', initial_schema)

# update table with primitive: FAIL
table = catalog.load_table('scott_test.element_id_bug')
update_schema = pa.schema([
    pa.field("c", pa.string(), nullable=True),
])
with table.update_schema() as update:
    update.union_by_name(update_schema)

... which somewhat makes sense because the REST payload above seems to contain all fields in the commit_table() request. But I can't imagine how it's anything but an AWS problem.
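
Since the commit carries the whole schema, any existing nested column is enough to trigger the error, whatever column is being added. A small sketch for spotting such columns in a schema written in the JSON form shown in the payload above; `nested_columns` is a hypothetical helper, and the field IDs in the sample are illustrative:

```python
def nested_columns(schema: dict) -> list[str]:
    """Names of top-level columns whose JSON type is a struct, list, or map."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        # In the JSON form, primitive types are strings and nested types are objects.
        if isinstance(field["type"], dict)
    ]


schema_after_update = {
    "type": "struct",
    "fields": [
        {"id": 1, "name": "a", "type": "string", "required": False},
        {
            "id": 2,
            "name": "b",
            "type": {"type": "list", "element-id": 3, "element": "string",
                     "element-required": False},
            "required": False,
        },
        {"id": 4, "name": "c", "type": "string", "required": False},
    ],
}
print(nested_columns(schema_after_update))  # ['b']
```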

srstrickland Oct 10 '25 19:10

Update:

We're currently in the process of getting this fixed with AWS. The support team has successfully reproduced the error and passed it on to the Glue service team.

mukul-mpac Oct 14 '25 14:10

I have an update from the AWS team - they have deployed a fix in the eu-west-1 region, and we’ve confirmed that it works correctly now.

jakuborlowski Nov 04 '25 14:11

thanks for the update @jakuborlowski

kevinjqliu Nov 04 '25 18:11

Another update from the AWS team:

The fix should now also be in the us-east-1 region, along with eu-west-1. It will slowly be rolled out to the other regions!

mukul-mpac Nov 06 '25 15:11

Update:

Great news, AWS has rolled out the fix.

I have tested the broken functionality as originally described in the issue, and it works now.

For reference, I am in region ca-central-1.

mukul-mpac Nov 11 '25 15:11