iceberg-python

Delete/Overwrite/Upsert with string key containing dot (.) hangs indefinitely

Open · chidachu77 opened this issue 3 months ago · 2 comments

Apache Iceberg version

None

Please describe the bug 🐞

Description

Any operation that rewrites data (delete, overwrite, or upsert) is extremely slow or unresponsive, even on very small tables. Appending rows works as expected.

Steps to Reproduce

from pyiceberg.catalog import load_catalog
from pyiceberg.schema import Schema
from pyiceberg.types import NestedField, StringType
import pyarrow as pa

# Assumed: a REST catalog configured via .pyiceberg.yaml / environment (catalog setup not shown in the original report)
catalog = load_catalog("rest")

# Create table with a string identifier field
schema = Schema(
    NestedField(field_id=1, name="load_id", field_type=StringType(), required=True),
    NestedField(field_id=2, name="status", field_type=StringType(), required=True),
    identifier_field_ids=[1],
)

catalog.create_table(identifier="test.test_load", schema=schema)

tbl = catalog.load_table("test.test_load")
df = pa.Table.from_pylist(
    [
        {"load_id": "123.123", "status": "started"},
        {"load_id": "456.456", "status": "done"},
    ],
    schema=tbl.schema().as_arrow(),
)
tbl.append(df)

# Delete with dot in string key → hangs indefinitely
tbl.delete(delete_filter="load_id == '123.123'")

Observed Behavior

  • append: completes quickly.
  • delete with a string key containing a dot (e.g., "123.123"): runs indefinitely or takes 10+ minutes without returning a response.
  • overwrite and upsert with a string key containing a dot: same behavior; the operation never completes or is extremely slow (a sketch of those calls follows below).
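
For reference, a minimal sketch of what the overwrite and upsert calls might have looked like; the report does not include them, so the update rows, the overwrite filter, and the use of load_id as the upsert key are assumptions based on the schema above.

# Assumed invocations (not shown in the original report), reusing the table from the repro above
df_update = pa.Table.from_pylist(
    [{"load_id": "123.123", "status": "retried"}],
    schema=tbl.schema().as_arrow(),
)

# Overwrite only the rows matching the dotted key → same hang as delete
tbl.overwrite(df_update, overwrite_filter="load_id == '123.123'")

# Upsert joins on the identifier field (load_id) → same hang
tbl.upsert(df_update)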

Environment

  • PyIceberg version: 0.10.0
  • Catalog: REST
  • Python version: 3.12
  • Storage: AWS S3 tables

Willingness to contribute

  • [ ] I can contribute a fix for this bug independently
  • [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • [ ] I cannot contribute a fix for this bug at this time

chidachu77 commented on Sep 24 '25

hmmm I added

import pytest
import pyarrow as pa

from pyiceberg.catalog import Catalog
from pyiceberg.schema import Schema

# CATALOGS and the table_schema_simple / table_name / database_name fixtures
# come from the existing catalog integration test module.
@pytest.mark.integration
@pytest.mark.parametrize("test_catalog", CATALOGS)
def test_slow_dots(
    test_catalog: Catalog, table_schema_simple: Schema, table_name: str, database_name: str
) -> None:
    identifier = (database_name, table_name)
    test_catalog.create_namespace(database_name)
    tbl = test_catalog.create_table(identifier, table_schema_simple)
    df = pa.Table.from_pylist(
        [
            {"foo": "123.123", "bar": 1, "baz": True},
            {"foo": "456.456", "bar": 1, "baz": True},
        ],
        schema=tbl.schema().as_arrow(),
    )
    tbl.append(df)
    tbl.delete(delete_filter="foo == '123.123'")

to the test_catalog integration tests locally and am not seeing a performance hit, but that doesn't test against S3 Tables.
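
One way to narrow this down (a suggestion, not something tested in this thread): check whether the dot in the literal survives predicate parsing, since delete accepts the filter as a string. A minimal sketch, assuming string row filters go through pyiceberg.expressions.parser.parse:

from pyiceberg.expressions.parser import parse

# Expected: EqualTo(term=Reference(name='foo'), literal=literal('123.123'))
# If the literal comes back truncated or mangled, the dot is confusing the parser
# rather than the rewrite path.
print(parse("foo == '123.123'"))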

jayceslesar commented on Sep 24 '25

I think I'm having something similar?

2025-11-10 16:49:45,720 [INFO] __main__: ✅ Table refreshed
2025-11-10 16:49:45,720 [INFO] __main__: 📸 Current snapshot: 2148201725482720065
2025-11-10 16:49:45,720 [INFO] __main__: 🎯 Creating delete filter for ID: 123
2025-11-10 16:49:45,721 [INFO] __main__: ✅ Delete filter created: EqualTo(term=Reference(name='id'), literal=literal('123'))
2025-11-10 16:49:45,721 [INFO] __main__: 💾 Starting Iceberg transaction...
2025-11-10 16:49:45,721 [INFO] __main__: 🔄 Transaction started successfully
2025-11-10 16:49:45,721 [INFO] __main__: ⏰ Transaction start time: 2025-11-10T16:49:45.721282
2025-11-10 16:49:45,721 [INFO] __main__: 🗑️  Executing delete operation for: 123

I have a table with ~17 million rows, partitioned by hour. The partition this row falls in has only around ~400 rows, so the delete should be fast, but I cancelled it after ~5 minutes.

In S3, I saw a new Parquet file containing all the rows except the one deleted, alongside the old file. I think this might be related to the transaction itself hanging?
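
A quick way to check that (a minimal sketch using the standard inspection APIs; the table identifier below is a placeholder): reload the table and look at the snapshot history. If the new Parquet file exists in S3 but no overwrite/delete snapshot ever shows up, the data was written and it is the commit to the catalog that is hanging.

tbl = catalog.load_table("my_db.my_table")  # placeholder identifier
print(tbl.current_snapshot())

# One row per snapshot, including the operation ("append", "overwrite", "delete") and its summary
print(tbl.inspect.snapshots())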

francocalvo commented on Nov 10 '25