Add StartTransaction API to REST multi-table transaction support

Open jackye1995 opened this issue 1 year ago • 1 comments

Proposed Change

We have identified an issue with the current commit-only multi-table transaction support. The proposal analyzes the isolation guarantee it could break and proposes introducing a StartTransaction API to solve the problem.

Proposal document

https://docs.google.com/document/d/10tfqETygf2BLA34CoZLxK3v5xk1BWUNKFA9WE8X_w-U/edit#heading=h.kbf1q7197nxq

Specifications

  • [X] Table
  • [ ] View
  • [X] REST
  • [ ] Puffin
  • [ ] Encryption
  • [ ] Other

jackye1995 avatar Jul 01 '24 14:07 jackye1995

Just adding the original doc for reference: https://docs.google.com/document/d/1UxXifU8iqP_byaW4E2RuKZx1nobxmAvc5urVcWas1B8/edit#heading=h.6sa1rpsxiuke

danielcweeks avatar Jul 01 '24 20:07 danielcweeks

I just wanted to clarify that what's currently in REST is not multi-table transaction support. It's purely an endpoint that allows a multi-table commit without providing any API semantics around transaction isolation. Actual multi-table transaction support is what's described in https://docs.google.com/document/d/1UxXifU8iqP_byaW4E2RuKZx1nobxmAvc5urVcWas1B8/edit#heading=h.6sa1rpsxiuke, and there's a prototype available in https://github.com/apache/iceberg/pull/6948.

Given that #6948 isn't done yet, it seems too early to talk about REST-related changes for multi-table transaction support, unless you had something else in mind here @jackye1995?

nastra avatar Jul 02 '24 16:07 nastra

I see, thanks for the context. I remember this PR; I thought the conclusion was to just do a multi-table commit. How about we use this proposal to track the full "multi-table transaction" support? I think the full support entails the concept of starting a transaction (createTransaction in your API), which needs to be server-aware. We can discuss these 2 proposals together. What do you think?
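To make the distinction in this thread concrete, here is a minimal sketch of how a commit-only multi-table endpoint differs from a transaction that the server knows has started. It is a toy in-memory model, not the Iceberg REST API and not the design from either proposal doc: `ToyCatalog`, `commit_many`, `start_transaction`, and `load_at` are all hypothetical names, and pinning a catalog version is just one assumed way a server could give reads a consistent view.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical in-memory catalog (illustration only, not Iceberg code)."""
    tables: dict = field(default_factory=dict)   # table name -> current snapshot id
    version: int = 0                             # bumped on every successful commit
    history: dict = field(default_factory=dict)  # catalog version -> copy of tables

    def __post_init__(self):
        self.history[self.version] = dict(self.tables)

    def load(self, name):
        """Read the latest snapshot id of a single table."""
        return self.tables[name]

    def commit_many(self, changes):
        """Atomic multi-table commit; changes maps name -> (expected, new)."""
        for name, (expected, _) in changes.items():
            if self.tables[name] != expected:
                raise RuntimeError(f"conflict on {name}")
        for name, (_, new) in changes.items():
            self.tables[name] = new
        self.version += 1
        self.history[self.version] = dict(self.tables)

    def start_transaction(self):
        """Assumed semantics: pin the catalog version for later reads."""
        return self.version

    def load_at(self, name, pinned_version):
        """Read a table as of the pinned catalog version."""
        return self.history[pinned_version][name]


catalog = ToyCatalog(tables={"orders": 1, "payments": 1})

# Commit-only: the reader loads tables one at a time, and another writer
# commits to both tables in between the two reads.
orders_seen = catalog.load("orders")
catalog.commit_many({"orders": (1, 2), "payments": (1, 2)})
payments_seen = catalog.load("payments")
skewed = (orders_seen, payments_seen)

# Server-aware start: reads are resolved against the pinned version,
# so a later concurrent commit does not change what the reader sees.
txn = catalog.start_transaction()
catalog.commit_many({"orders": (2, 3), "payments": (2, 3)})
consistent = (catalog.load_at("orders", txn), catalog.load_at("payments", txn))

print(skewed, consistent)
```

In this toy model the write itself is atomic in both cases; what the commit-only path lacks is any point at which the server learns that a set of reads belongs together, which is why the first reader ends up with snapshot ids from two different catalog states.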

jackye1995 avatar Jul 02 '24 16:07 jackye1995

We can definitely rename this proposal to track Catalog Transaction API support (aka multi-table transactions), but I don't recall that we have concluded on just doing a multi-table commit. I'll rename this proposal to reflect the work mentioned in the doc, and we can add anything else that needs to be discussed on top of that.

nastra avatar Jul 02 '24 16:07 nastra

> I don't recall that we have concluded on just doing a multi-table commit.

Yeah, that's probably just my misunderstanding, since multi-table commit was what was eventually added.

So just to be clear, this will be only for the REST catalog right? Do we consider this feature also for other catalogs? Because I see you write that in the Google doc "Implementing multi-table TX support for other catalogs" is a non-goal, but I did not see any OpenAPI specification description in the doc.

jackye1995 avatar Jul 02 '24 17:07 jackye1995

> So just to be clear, this will be only for the REST catalog right? Do we consider this feature also for other catalogs? Because I see you write that in the Google doc "Implementing multi-table TX support for other catalogs" is a non-goal, but I did not see any OpenAPI specification description in the doc.

The scope of the design doc / impl is to add all of the core APIs required to support multi-table transactions in the first place. Adding support for REST would be the next logical step in showing that multi-table transactions actually work. The APIs need to be designed in a way that other catalogs would theoretically be able to support multi-table transactions, but in practice only REST / Nessie might be able to support it.

The reason I haven't done any REST spec work yet is that the core APIs and the impl haven't been solidified yet, and my focus back then shifted to adding view support.

nastra avatar Jul 03 '24 06:07 nastra

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Dec 31 '24 00:12 github-actions[bot]

Guys, any update on multi-table transaction progress with Nessie or the REST catalog?

heman026 avatar Jul 03 '25 05:07 heman026

@heman026 I haven't had the resources to get back to this proposal, but there's another proposal being discussed here: https://lists.apache.org/thread/r5otylsrm4txd4oxyv7c6scdwrbolck9

nastra avatar Jul 03 '25 08:07 nastra