Add support for BigLake metastore in Iceberg REST catalog
Description
Add `GOOGLE` as an allowed value of the `iceberg.rest-catalog.security` config property to support BigLake metastore.
This PR also adds the `iceberg.rest-catalog.google-project-id` config property.
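As an illustrative sketch (not taken from this PR's docs), a BigLake-backed Iceberg catalog could then be configured roughly like this; the endpoint URI follows the linked Google documentation, and the project ID is a placeholder:

```properties
# etc/catalog/biglake.properties (hypothetical example)
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=https://biglake.googleapis.com/iceberg/v1beta/restcatalog
# New in this PR: GOOGLE security mode and the Google project ID
iceberg.rest-catalog.security=GOOGLE
iceberg.rest-catalog.google-project-id=my-gcp-project
```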
- Rate limits on BigLake metastore (raised over the course of this work):
  - Iceberg REST Catalog read requests per minute: 120 -> 360 -> 500 now
  - Iceberg REST Catalog namespace and table write requests per minute: 60 -> 180 -> 300 now
```
Caused by: org.apache.iceberg.exceptions.RESTException: Unable to process: Quota exceeded for quota metric 'Iceberg REST Catalog read requests' and limit 'Iceberg REST Catalog read requests per minute' of service 'biglake.googleapis.com' for consumer 'project_number:505084745097'.
    at org.apache.iceberg.rest.ErrorHandlers$DefaultErrorHandler.accept(ErrorHandlers.java:250)
    at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:124)
    at org.apache.iceberg.rest.ErrorHandlers$TableErrorHandler.accept(ErrorHandlers.java:108)
    at org.apache.iceberg.rest.HTTPClient.throwFailure(HTTPClient.java:240)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:336)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:297)
    at org.apache.iceberg.rest.BaseHTTPClient.get(BaseHTTPClient.java:77)
    at org.apache.iceberg.rest.RESTSessionCatalog.loadInternal(RESTSessionCatalog.java:375)
```
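For illustration only, a per-minute quota error like the one above is typically handled client-side with exponential backoff and retry. The sketch below is a hypothetical helper, not Trino's or the Iceberg REST client's actual retry logic (the Iceberg `HTTPClient` has its own retry handling); the class and exception names are made up for the example:

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical sketch: retry a metastore call with exponential backoff when
// the BigLake quota is exceeded (HTTP 429). Not Trino's actual code.
public class QuotaBackoff {
    // Stand-in for the RESTException thrown on "Quota exceeded".
    static class QuotaExceededException extends RuntimeException {}

    static <T> T callWithBackoff(Callable<T> call, int maxAttempts) throws Exception {
        Duration delay = Duration.ofMillis(100);
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            }
            catch (QuotaExceededException e) {
                if (attempt >= maxAttempts) {
                    throw e; // quota still exceeded after all attempts
                }
                Thread.sleep(delay.toMillis());
                delay = delay.multipliedBy(2); // double the wait each round
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate a call that returns 429 twice, then succeeds.
        String result = callWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new QuotaExceededException();
            }
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```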
Relates to https://cloud.google.com/bigquery/docs/blms-rest-catalog Supersedes #26054
Release notes
## Iceberg
* Add support for BigLake metastore in Iceberg REST catalog. ({issue}`26219`)
This looks fantastic! Thank you for writing this!
Excited to see this get merged
I guess BigLake metastore created a new metadata file in the background:
```
Error: Errors:
Error: TestIcebergBigLakeMetastoreConnectorSmokeTest>BaseIcebergConnectorSmokeTest.testRegisterTableWithComments:268->AbstractTestQueryFramework.assertUpdate:411->AbstractTestQueryFramework.assertUpdate:416 »
QueryFailed More than one latest metadata file found at location: gs://trino-ci-test/test_iceberg_biglake_gk2j0xnz67/test_register_table_with_comments_xyo3k42eom/metadata, latest metadata files are [
gs://trino-ci-test/test_iceberg_biglake_gk2j0xnz67/test_register_table_with_comments_xyo3k42eom/metadata/00005-6913e897-0000-2d46-9aaa-14223bce81ce.metadata.json,
gs://trino-ci-test/test_iceberg_biglake_gk2j0xnz67/test_register_table_with_comments_xyo3k42eom/metadata/00005-69218711-0000-2ea5-81a1-c82add7ff230.metadata.json]
```
https://github.com/trinodb/trino/actions/runs/19693222597/job/56413468126?pr=26219
@talatuyarer Is my understanding correct? Any way to disable the background task?
Hey! So what happened here is that the last ADD COMMENT call returned a 429, but it looks like BigLake metastore leaked the metadata file that was meant to be persisted, so when the client-side retry came, there were two metadata files of version 00005.
So BigLake wasn't creating files in the background per se, and this should only rarely happen, when the last update before a RegisterTable hits a retryable error.
In any case, that should be fixed, but IMO it shouldn't block submission.
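For illustration, the condition Trino is tripping on can be sketched as follows. This is a hypothetical helper, not Trino's actual code; it only assumes Iceberg's `<version>-<uuid>.metadata.json` file-naming convention, under which two files sharing the highest version prefix (here 00005) are ambiguous:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch: detect when more than one metadata file claims the
// latest version, as in the failing register_table test above.
public class LatestMetadata {
    // Returns the latest version if it is duplicated, empty otherwise.
    static OptionalInt duplicatedLatestVersion(List<String> fileNames) {
        Map<Integer, Integer> countsByVersion = new HashMap<>();
        for (String name : fileNames) {
            // Version is the numeric prefix before the first '-'.
            int version = Integer.parseInt(name.substring(0, name.indexOf('-')));
            countsByVersion.merge(version, 1, Integer::sum);
        }
        int latest = Collections.max(countsByVersion.keySet());
        return countsByVersion.get(latest) > 1 ? OptionalInt.of(latest) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Shortened versions of the file names from the CI failure.
        List<String> files = List.of(
                "00004-aaaa.metadata.json",
                "00005-6913e897.metadata.json",
                "00005-69218711.metadata.json");
        System.out.println(duplicatedLatestVersion(files));
    }
}
```

Both 00005 files tie for the latest version, so a reader cannot tell which one is the table's current metadata, which is exactly the `More than one latest metadata file found` failure.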
@Noremac201 Thank you! To clarify, does "that should be fixed" refer to Trino or BigLake metastore?
> IMO shouldn't block submission
Unfortunately, we can't merge this PR while we know there are flaky tests.
If resolving the root cause will take time, an alternative would be to temporarily disallow the register_table procedure for the BigLake metastore integration.
Ah I see.
I meant it should be fixed on the BigLake side, ideally we're not leaking any metadata files if the transaction doesn't commit on the server side.
Is it possible to just disable that one test -- ..._with_comments? The flow of the test with the ~5 updates in succession is triggering the 429, otherwise I'll leave it to @talatuyarer as to whether to disable register table for the time being.
Rebased on master without any changes.
is the build failure related?
No, it's Databricks CI failure #27642
@wendigo @findepi Could you please review this PR when you have time?