[Bug]: ICEBERG_CANNOT_OPEN_SPLIT
What happened
It looks like Nessie is referencing files in GCS that have been deleted, either by GC or by a concurrent query on the same table. This happens during backfills, and as of now the only way to fix it is to recreate the table.
[2025-11-06, 21:14:33] INFO - [base] 21:14:33 Database Error in model daily_equity_deposits_margin (models/data_science/daily_equity_deposits_margin.sql): source="airflow.providers.cncf.kubernetes.utils.pod_manager.PodManager"
[2025-11-06, 21:14:33] INFO - [base] TrinoExternalError(type=EXTERNAL, name=ICEBERG_CANNOT_OPEN_SPLIT, message="Error opening Iceberg split gs://xxx/public/fact_account_kpis_d252b9ee-5f55-4e12-b60b-13a8ba8841ef/data/_is_lpca=0/day=2025-10-26/20251105_235839_08692_yzwxz-383988d3-bd9b-4c8e-aee8-542d97cb511e.parquet (offset=4, length=125686177): File gs://xxx/public/fact_account_kpis_d252b9ee-5f55-4e12-b60b-13a8ba8841ef/data/_is_lpca=0/day=2025-10-26/20251105_235839_08692_yzwxz-383988d3-bd9b-4c8e-aee8-542d97cb511e.parquet not found", query_id=20251106_210829_77657_yzwxz): source="airflow.providers.cncf.kubernetes.utils.pod_manager.PodManager"
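For triage, it may help to pull the missing file and its partition out of log lines like the one above. The following is a minimal sketch, not part of Nessie or Trino: the names `parse_split_error` and `MissingSplit` are hypothetical, and the regular expressions are based only on the message shape shown in this report. It also assumes that the data file name begins with the ID of the Trino query that wrote it, which is inferred from the file name and not confirmed here.

```python
import re
from typing import NamedTuple, Optional

# Matches the message text as it appears in the log line in this report.
SPLIT_ERROR = re.compile(
    r"Error opening Iceberg split (?P<uri>\S+) "
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)"
)

# Assumption: the file name starts with the writing query's ID, followed by "-".
WRITER_QUERY_ID = re.compile(r"^(?P<qid>\d{8}_\d{6}_\d{5}_[a-z0-9]+)-")


class MissingSplit(NamedTuple):
    uri: str
    offset: int
    length: int
    partition: dict
    writer_query_id: Optional[str]


def parse_split_error(line: str) -> Optional[MissingSplit]:
    """Extract the missing file details from one log line, or None if absent."""
    match = SPLIT_ERROR.search(line)
    if match is None:
        return None
    uri = match.group("uri")
    segments = uri.split("/")
    # Partition directories have the form key=value; the last segment is the file.
    partition = dict(seg.split("=", 1) for seg in segments[:-1] if "=" in seg)
    writer = WRITER_QUERY_ID.match(segments[-1])
    return MissingSplit(
        uri=uri,
        offset=int(match.group("offset")),
        length=int(match.group("length")),
        partition=partition,
        writer_query_id=writer.group("qid") if writer else None,
    )
```

If the file name assumption holds, comparing `writer_query_id` with the `query_id` of the failing query would show which earlier query wrote the file that later went missing.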
How to reproduce it
Not sure how to reproduce.
Nessie server type (docker/uber-jar/built from source) and version
ghcr.io/projectnessie/nessie:0.104.2
Client type (Ex: UI/Spark/pynessie ...) and version
trinodb/trino:476
Additional information
No response