Issue reading Iceberg tables in Nessie + MinIO using PyIceberg
Question
I am getting "Failed to read table metadata from s3a://iceberg-datalake/test/emp_69182e21-1700-4317-9f75-55fca8d57979/metadata/00002-d7b6a027-3d3d-4a1f-9350-ce019969cc2e.metadata.json" when loading a table using the PyIceberg REST catalog. I am using the Nessie catalog and MinIO for storage.
from pyiceberg.catalog import load_catalog

# Connect to Nessie's Iceberg REST endpoint; table data is stored in MinIO.
catalog = load_catalog(
    "rest",
    **{
        "uri": "http://10.55.134.161:19120/iceberg",
        "s3.endpoint": "http://10.55.134.161:9000",
        "warehouse": "warehouse",
        "s3.access-key-id": "minioadmin",
        "s3.secret-access-key": "minioadmin",
    },
)
con = catalog.load_table("test.emp")
Nessie Configuration Used:
java
-Dquarkus.management.port=9090
-Dnessie.version.store.type=JDBC
-Dquarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/nessie_db
-Dquarkus.datasource.username=nessie
-Dquarkus.datasource.password=nessie
-Dnessie.catalog.default-warehouse=warehouse
-Dnessie.catalog.warehouses.warehouse.location=s3a://iceberg-datalake
-Dnessie.catalog.service.s3.default-options.endpoint=http://10.55.134.161:9000
-Dnessie.catalog.service.s3.default-options.path-style-access=true
-Dnessie.catalog.service.s3.default-options.access-key=minioadmin
-Dnessie.catalog.service.s3.default-options.secret-key=minioadmin
-Dnessie.server.authentication.enabled=false
-Dnessie.catalog.service.s3.default-options.region=us-east-1
-jar nessie-quarkus-0.100.2-runner.jar
Error
Exception has occurred: BadRequestError
IllegalArgumentException: java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to read table metadata from s3a://iceberg-datalake/test/emp_69182e21-1700-4317-9f75-55fca8d57979/metadata/00002-d7b6a027-3d3d-4a1f-9350-ce019969cc2e.metadata.json
File "C:\pyiceberg\catalog\rest.py", line 697, in load_table
    response.raise_for_status()
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://10.55.134.161:19120/iceberg/v1/main%7Cwarehouse/namespaces/test/tables/emp

The above exception was the direct cause of the following exception:

File "C:\pyiceberg\catalog\rest.py", line 476, in _handle_non_200_response
    raise exception(response) from exc
File "C:\Hemanath\KAI\Iceberg Evaluation\Docker\duck\pyiceberg1\catalog\rest.py", line 699, in load_table
    self._handle_non_200_response(exc, {404: NoSuchTableError})
File "C:\duck.py", line 55, in
    con = catalog.load_table('test.emp')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyiceberg1.exceptions.BadRequestError: IllegalArgumentException: java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to read table metadata from s3a://iceberg-datalake/test/emp_69182e21-1700-4317-9f75-55fca8d57979/metadata/00002-d7b6a027-3d3d-4a1f-9350-ce019969cc2e.metadata.json
Note: I also disabled S3 request signing in Nessie (-Dnessie.catalog.service.s3.default-options.request-signing-enabled=false), but I am still getting the same error.
Please help me resolve this. Thanks
@heman026 Thanks for raising this issue. I'm not super familiar with Nessie, but I do notice that the warehouse configuration should be an s3 path: s3a://iceberg-datalake/
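A minimal sketch of what this suggestion could look like on the PyIceberg side, assuming the rest of the configuration from the question stays unchanged. The helper names `rest_catalog_properties` and `load_emp_table` are hypothetical, and whether Nessie expects the warehouse name or its S3 location in this property is not confirmed in this thread, so it may be worth trying both.

```python
def rest_catalog_properties(warehouse: str) -> dict:
    """Build REST catalog properties for the Nessie + MinIO setup from the question.

    Hypothetical helper: only the `warehouse` value varies between attempts.
    """
    return {
        "uri": "http://10.55.134.161:19120/iceberg",
        "s3.endpoint": "http://10.55.134.161:9000",
        "warehouse": warehouse,
        "s3.access-key-id": "minioadmin",
        "s3.secret-access-key": "minioadmin",
    }


def load_emp_table(warehouse: str):
    """Load test.emp with the given warehouse value (needs a reachable Nessie server)."""
    from pyiceberg.catalog import load_catalog  # imported lazily; requires pyiceberg

    catalog = load_catalog("rest", **rest_catalog_properties(warehouse))
    return catalog.load_table("test.emp")


# The original value from the question, and the S3 path suggested above.
by_name = rest_catalog_properties("warehouse")
by_location = rest_catalog_properties("s3a://iceberg-datalake/")
```

Calling `load_emp_table("s3a://iceberg-datalake/")` would then exercise the suggested variant against the running server.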
Hi @Fokko
I have a similar question about reading an Iceberg table from a Nessie server.
I set up a Nessie server locally, and I would like to access an Iceberg table.
Here is the output of response = requests.get("http://localhost:19120/api/v1/config", auth=HTTPBasicAuth("test-nessie", "test-nessie"))
Response JSON: {'defaultBranch': 'main', 'maxSupportedApiVersion': 2}
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "nessie",
    uri="http://localhost:19120/api",
    ref="main",
    authentication={
        "type": "BASIC",
        "username": "test-nessie",
        "password": "test-nessie",
    },
)
This code fails with 2 validation errors for ConfigResponse.
ConfigResponse expects the following format:
class ConfigResponse(IcebergBaseModel):
    defaults: Properties = Field()
    overrides: Properties = Field()
However, the response contains defaultBranch and maxSupportedApiVersion instead. Any thoughts on how to read from a local server?
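The mismatch described here can be checked without PyIceberg. The sketch below is only an illustration: `classify_config_response` is a hypothetical helper, and it assumes that a payload with `defaults` and `overrides` comes from an Iceberg REST config endpoint, while a payload with `defaultBranch` comes from Nessie's native API (as in the response shown above).

```python
def classify_config_response(payload: dict) -> str:
    """Guess which API produced a config response (hypothetical helper)."""
    # The ConfigResponse model quoted above requires both of these fields.
    if "defaults" in payload and "overrides" in payload:
        return "iceberg-rest"
    # The native Nessie config endpoint returned this field in the thread.
    if "defaultBranch" in payload:
        return "nessie-native"
    return "unknown"


# Payload returned by http://localhost:19120/api/v1/config in the question.
nessie_payload = {"defaultBranch": "main", "maxSupportedApiVersion": 2}
# Smallest payload that satisfies the ConfigResponse model.
iceberg_payload = {"defaults": {}, "overrides": {}}

nessie_kind = classify_config_response(nessie_payload)
iceberg_kind = classify_config_response(iceberg_payload)
print(nessie_kind)   # nessie-native
print(iceberg_kind)  # iceberg-rest
```

If the URI passed to `load_catalog` yields a "nessie-native" payload, the client is probably talking to the wrong base path rather than to an Iceberg REST endpoint.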
@HungYangChang Check this https://github.com/apache/iceberg-python/issues/1524
good catch @heman026, did that resolve your issue?
It is not working for me so far; I am still trying.
My config:
print("Initializing Nessie client...")
catalog = load_catalog(
    "rest",
    **{
        "uri": "http://10.3.120.105:19120/iceberg",
        "authentication.type": "BASIC",
        "authentication.username": "test-nessie",
        "authentication.password": "test-nessie",
    },
)
print("Set up correctly")
Still the same error:
pydantic_core._pydantic_core.ValidationError: 2 validation errors for ConfigResponse
defaults
  Field required [type=missing, input_value={'defaultBranch': 'main',...SupportedApiVersion': 2}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
overrides
  Field required [type=missing, input_value={'defaultBranch': 'main',...SupportedApiVersion': 2}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
I also tried http://localhost:19120/iceberg, but it doesn't work either.
My version:
pyiceberg=0.8.1
Btw, I can confirm the Nessie server is set up correctly; see:
http://localhost:19120/content/main/demo_0128_v3/names
@HungYangChang I would recommend looking at the Nessie documentation on how to connect to an Iceberg REST catalog. PyIceberg accepts the standard Iceberg REST catalog configurations: https://py.iceberg.apache.org/configuration/#rest-catalog
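As a rough sketch of how the credentials from the earlier attempt could be expressed with REST catalog properties: the PyIceberg configuration page describes `header.`-prefixed properties that are sent as HTTP headers, so a Basic `Authorization` header can be built by hand. This assumes the Nessie server accepts HTTP Basic authentication on its Iceberg REST endpoint, which this thread does not confirm; `basic_auth_properties` is a hypothetical helper.

```python
import base64


def basic_auth_properties(uri: str, username: str, password: str) -> dict:
    """Build REST catalog properties carrying an HTTP Basic Authorization header.

    Hypothetical helper; assumes the server accepts Basic auth on this endpoint.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {
        "uri": uri,
        # Properties prefixed with "header." are forwarded as request headers.
        "header.Authorization": f"Basic {token}",
    }


props = basic_auth_properties(
    "http://localhost:19120/iceberg", "test-nessie", "test-nessie"
)

# With a reachable server and pyiceberg installed, this would be used as:
# from pyiceberg.catalog import load_catalog
# catalog = load_catalog("rest", **props)
```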
@HungYangChang Did you resolve that? I have the same issue.
This is my working setup; the Nessie configuration might differ for you: https://github.com/apache/iceberg-python/issues/1524#issuecomment-2683847774 Can you check whether this works for you?