feat(core): expose remove_snapshots at Transaction API

Open TennyZhuang opened this issue 11 months ago • 5 comments

A part of #743

For consistency with the Spark procedure, maybe a better name is expire_snapshots. Until we support the entire list of parameters, a public remove_snapshots method that only supports specified snapshot_ids is still useful. We can consider deprecating this API in the future.

TennyZhuang avatar Jan 12 '25 08:01 TennyZhuang

I am not sure if this is a server issue or a client issue; more research is needed.

If only one snapshot_id is passed, the request succeeds.

table: TableIdent { namespace: NamespaceIdent(["data", "other"]), name: "1m_market_return_10m" }, snapshot_ids: [3419887305852055718, 3557841878473470571, 1963578191607528169, 3970602252979962677, 2254539859506811453, 7503617340073624129, 4090418433635376432, 3991054311210587887, 8251276597814139233, 1287384849826930466]
Error: Unexpected => Failed to parse response from rest catalog server!

Context:
   code: 500 Internal Server Error
   method: POST
   url: http://localhost:8182/v1/namespaces/data%1Fother/tables/1m_market_return_10m
   json: <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 500 java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove. Expected one value but received: [3419887305852055718, 3557841878473470571, 1963578191607528169, 3970602252979962677, 2254539859506811453, 7503617340073624129, 4090418433635376432, 3991054311210587887, 8251276597814139233, 1287384849826930466]</title>
</head>
<body><h2>HTTP ERROR 500 java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove. Expected one value but received: [3419887305852055718, 3557841878473470571, 1963578191607528169, 3970602252979962677, 2254539859506811453, 7503617340073624129, 4090418433635376432, 3991054311210587887, 8251276597814139233, 1287384849826930466]</h2>
<table>
<tr><th>URI:</th><td>/v1/namespaces/data%1Fother/tables/1m_market_return_10m</td></tr>
<tr><th>STATUS:</th><td>500</td></tr>
<tr><th>MESSAGE:</th><td>java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove. Expected one value but received: [3419887305852055718, 3557841878473470571, 1963578191607528169, 3970602252979962677, 2254539859506811453, 7503617340073624129, 4090418433635376432, 3991054311210587887, 8251276597814139233, 1287384849826930466]</td></tr>
<tr><th>SERVLET:</th><td>org.apache.iceberg.rest.RESTCatalogServlet-7d9f158f</td></tr>
<tr><th>CAUSED BY:</th><td>java.lang.IllegalArgumentException: Invalid set of snapshot ids to remove. Expected one value but received: [3419887305852055718, 3557841878473470571, 1963578191607528169, 3970602252979962677, 2254539859506811453, 7503617340073624129, 4090418433635376432, 3991054311210587887, 8251276597814139233, 1287384849826930466]</td></tr>
</table>
<hr><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.0</a><hr/>

</body>
</html>


Source: expected value at line 1 column 1

TennyZhuang avatar Jan 13 '25 01:01 TennyZhuang

https://github.com/apache/iceberg/blob/c7910bb401f7f7fd09010bede0d80f5d2164afd5/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L547

This is a hard limit in the server implementation. I am not sure about the reason for this design, as I am not familiar enough with the specification of iceberg.
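
To make the limit concrete, here is a minimal Rust sketch of the behavior the error message above implies: exactly one snapshot id is accepted, anything else is rejected. The function name `check_single_snapshot_id` is hypothetical and this is only a paraphrase of the observed behavior, not the Java code at the linked line.

```rust
/// Hypothetical stand-in for the server-side validation described above.
/// Accepts a list of snapshot ids only when it holds exactly one value.
fn check_single_snapshot_id(snapshot_ids: &[i64]) -> Result<i64, String> {
    match snapshot_ids {
        // Exactly one id: the only shape the server accepted in the log above.
        [id] => Ok(*id),
        // Zero or several ids: mirror the wording of the reported error.
        _ => Err(format!(
            "Invalid set of snapshot ids to remove. Expected one value but received: {:?}",
            snapshot_ids
        )),
    }
}
```

Under this assumption, a call with one id returns that id, while a call with ten ids fails in the same way as the request shown earlier.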

TennyZhuang avatar Jan 13 '25 01:01 TennyZhuang

I am not sure if this is a server issue or a client issue; more research is needed.

Looks like it depends on the catalog. When a list of snapshots is removed, the removal is applied to the metadata on the client, which also appends multiple changes (one per removed snapshot). https://github.com/apache/iceberg/blob/2f88ff66d05269b04e3621fe067ccdab668f3191/core/src/main/java/org/apache/iceberg/TableMetadata.java#L1425 For the REST catalog, the changes are sent to the server (and processed there, I guess). https://github.com/apache/iceberg/blob/2f88ff66d05269b04e3621fe067ccdab668f3191/core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java#L153

ZENOTME avatar Feb 26 '25 09:02 ZENOTME

This is a hard limit in the server implementation. I am not sure about the reason for this design, as I am not familiar enough with the specification of iceberg.

+1, I also encountered this problem. What is confusing is that the semantics of the REST API are inconsistent with the server implementation: the OpenAPI spec describes a list of snapshot ids, while the server parser accepts exactly one.

https://github.com/apache/iceberg/blob/3dba6afb789a420373e44ac29ecdef866bd7ebee/core/src/main/java/org/apache/iceberg/MetadataUpdateParser.java#L557

https://github.com/apache/iceberg/blob/6e8718113c08aebf76d8e79a9e2534c89c73407a/open-api/rest-catalog-open-api.yaml#L2823

Li0k avatar Mar 12 '25 06:03 Li0k

Looks like this hard limit will affect the remove_snapshots implementation, e.g. we must split multiple snapshot ids into multiple remove-snapshot requests, one id each, rather than sending all ids in a single remove-snapshot request. cc @Fokko
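
A rough sketch of that split, assuming a simplified update type: `RemoveSnapshotsUpdate` and `split_remove_snapshots` are hypothetical names for illustration, not the actual iceberg-rust API. Each input id becomes its own update carrying a single-element list, which is the shape the server accepted above.

```rust
/// Hypothetical, simplified model of one remove-snapshots update.
#[derive(Debug, Clone, PartialEq)]
struct RemoveSnapshotsUpdate {
    snapshot_ids: Vec<i64>,
}

/// Turn N snapshot ids into N updates with one id each, preserving order.
fn split_remove_snapshots(snapshot_ids: &[i64]) -> Vec<RemoveSnapshotsUpdate> {
    snapshot_ids
        .iter()
        .map(|&id| RemoveSnapshotsUpdate {
            snapshot_ids: vec![id],
        })
        .collect()
}
```

The trade-off is more updates per commit, but every update then satisfies the one-id restriction regardless of how many snapshots the caller wants to remove.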

ZENOTME avatar Mar 12 '25 06:03 ZENOTME