iceberg Support client-side purge in REST catalog

Open flyrain opened this issue 10 months ago • 4 comments

Proposed Change

Current REST clients rely on the REST server to delete table files when a table is dropped with purge. There are two concerns with this approach:

The REST server isn't necessarily able to access users' storage. It cannot delete table files if it lacks the required permissions.
The REST server may take a performance hit when purging a table with a large number of files.

I propose supporting client-side purging, while still allowing server-side deletion so that the current behavior remains compatible.

Option 1: put the purge state in the delete table response.

DeleteTableResponse:
  type: object
  properties:
    purged:
      type: boolean

Clients can decide whether to delete files based on the response. If the files were already deleted on the server side, the client does nothing; otherwise, it deletes them on the client side.
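A minimal sketch of how a client might act on such a response, for illustration only. The function name `purge_after_drop`, the `delete_file` callback, and the choice to treat a missing `purged` field as "the server already purged" (to match today's behavior) are all assumptions, not part of the REST spec or any existing client.

```python
from typing import Callable, Iterable, Mapping


def purge_after_drop(
    response: Mapping[str, object],
    table_files: Iterable[str],
    delete_file: Callable[[str], None],
) -> int:
    """Delete table files client-side unless the server reports them purged.

    `response` is the parsed body of the hypothetical DeleteTableResponse.
    Returns the number of files the client deleted.
    """
    # Assumption: a missing `purged` field means the server follows the
    # current behavior and has already deleted the files.
    if response.get("purged", True):
        return 0

    deleted = 0
    for path in table_files:
        delete_file(path)  # hypothetical storage delete, e.g. via the client's file IO
        deleted += 1
    return deleted
```

The client would need to collect the table's file list before issuing the drop, since the metadata is no longer reachable through the catalog afterwards.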

Option 2: check for the existence of table files on the client side.

The client can check whether the files still exist, then decide whether to delete them. This doesn't need spec changes, but clients would rely on a convention instead of the spec, which is a bit ambiguous.
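A rough sketch of the existence-check convention, under the assumption that the client captured the table's file list before the drop. The names `purge_if_present`, `exists`, and `delete_file` are hypothetical placeholders for whatever storage access the client has.

```python
from typing import Callable, Iterable, List


def purge_if_present(
    table_files: Iterable[str],
    exists: Callable[[str], bool],
    delete_file: Callable[[str], None],
) -> List[str]:
    """Delete only the files that are still present after the drop.

    Returns the paths the client removed. An empty result suggests the
    server already purged the table.
    """
    removed = []
    for path in table_files:
        # Files the server already deleted are skipped.
        if exists(path):
            delete_file(path)
            removed.append(path)
    return removed
```

Note that this costs an extra existence check per file, and a server that purges asynchronously could race with the client, which is part of why the convention is ambiguous.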

WDYT? Please share your feedback.

cc @RussellSpitzer @aokolnychyi @rdblue @danielcweeks @Fokko

Proposal document

No response

Specifications

[ ] Table
[ ] View
[X] REST
[ ] Puffin
[ ] Encryption
[ ] Other

Apr 05 '24 19:04 flyrain
