PyIceberg is not respecting `token` in the load table response

Open creechy opened this issue 1 year ago • 2 comments

Apache Iceberg version

0.7.1 (latest release)

Please describe the bug 🐞

In the Iceberg REST API spec, the load table endpoint can include a config map with additional properties to apply when accessing the given table. One of these is a potential `token` property in the LoadTableResult, described in a section which states:

The following configurations should be respected by clients

PyIceberg does not appear to respect this property; it continues to use the original token supplied in the catalog configuration request. This can lead to incorrect permissions being applied to table operations, which in some cases could prevent operations from succeeding when they should.
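To make the expected behavior concrete, here is a minimal sketch of the token precedence I would expect. This is not PyIceberg's actual code: `effective_token` and `auth_header` are hypothetical helpers, and the two dictionaries stand in for the catalog properties and the `config` map of a LoadTableResult.

```python
from typing import Dict, Mapping, Optional


def effective_token(
    catalog_props: Mapping[str, str],
    table_config: Optional[Mapping[str, str]],
) -> Optional[str]:
    """Pick the token for table-level requests (hypothetical helper).

    A non-empty token in the load table response's config takes precedence;
    otherwise fall back to the token from the catalog configuration.
    """
    if table_config and table_config.get("token"):
        return table_config["token"]
    return catalog_props.get("token")


def auth_header(token: Optional[str]) -> Dict[str, str]:
    """Build a bearer Authorization header, or no header without a token."""
    return {"Authorization": f"Bearer {token}"} if token else {}


# Placeholder values, not real tokens.
catalog_props = {"token": "<catalog-token>"}
table_config = {"token": "<table-token>"}

headers = auth_header(effective_token(catalog_props, table_config))
```

With both tokens present, the header carries the table-level token; with no `config` or no `token` in it, the catalog-level token is used as before.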

creechy avatar Aug 28 '24 21:08 creechy

Thanks for reporting this, @creechy. Do you have an example to reproduce this issue?

For now, I found some more docs on `token`: https://github.com/apache/iceberg/blob/cd32ec76ecd2866c05185065e4ed7196121de49a/open-api/rest-catalog-open-api.yaml#L672-L675

And the relevant code in the REST client implementation: https://github.com/apache/iceberg-python/blob/e4c1748fee220076f04e35ab2f182dd51ca20705/pyiceberg/catalog/rest.py#L506-L515

kevinjqliu avatar Aug 31 '24 13:08 kevinjqliu

@kevinjqliu

Do you have an example to reproduce this issue?

I provided an example with a Tabular config to @Fokko, who confirmed that PyIceberg does not appear to be switching tokens. FWIW, here's the script. It's probably not all that useful on its own, but if you know how to look at the response of load_table, you should see a config object with a token in it, which PyIceberg is not using. In my case, the original token does not have the privileges needed to update the table, so the append ends with a 409 Conflict instead of succeeding.

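import pyarrow as pa
from pyiceberg.catalog import load_catalog

# Authenticate against the REST catalog with the catalog-level credential.
catalog = load_catalog(
    "raw",
    **{
        "type": "rest",
        "uri": "https://api.tabular.io/ws/",
        "warehouse": "<warehouse>",
        "credential": "<credential>",
    },
)

# The load table response for this table carries a config object with a token.
table = catalog.load_table("default.buddy")

schema = table.schema().as_arrow()

df = pa.Table.from_pylist(
    [{"s": "Groningen"}], schema=schema
)

# Ends with 409 Conflict here, since the original token is still in use.
table.append(df)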
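For anyone who captures the raw load table response (via a proxy or debug logging), a small sketch like the one below can confirm whether the server returned a table-level token. The helper `describe_table_config` is hypothetical, and the sample payload is made up for illustration; only the `metadata-location`, `metadata`, and `config` field names come from the REST spec's LoadTableResult.

```python
import json
from typing import Any, Dict


def describe_table_config(load_table_response: str) -> Dict[str, Any]:
    """Summarize the config map of a load table response (hypothetical helper).

    Reports only the key names, so the token value itself is never printed.
    """
    payload = json.loads(load_table_response)
    config = payload.get("config") or {}
    return {"has_token": "token" in config, "config_keys": sorted(config)}


# Made-up response body for illustration; a real one has full table metadata.
sample = json.dumps(
    {
        "metadata-location": "s3://<bucket>/<path>/metadata.json",
        "metadata": {},
        "config": {"token": "<table-scoped-token>"},
    }
)

summary = describe_table_config(sample)
print(summary)
```

If `has_token` is true for the captured response but requests for that table still carry the catalog-level token, that matches the behavior described in this issue.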

creechy avatar Aug 31 '24 15:08 creechy