trino icon indicating copy to clipboard operation
trino copied to clipboard

Jdbc merge support

Open chenjian2664 opened this issue 1 year ago • 8 comments

Description

This is a follow up of https://github.com/trinodb/trino/pull/16693

It mainly implements several important interfaces in the https://trino.io/docs/current/develop/supporting-merge.html#connector-support-for-merge document:

  • ConnectorMergeSink API. In storeMergedRows, the page data is classified into INSERT, UPDATE, DELETE three categories, and then call the corresponding Sink class to operate the corresponding Connector. INSERT data is processed through the existing JdbcPageSink. UPDATE and DELETE type data are modified through the primary keys. The implementation reuses the JdbcPageSink code and only changes the executed sql statement. The syntax used when deleting data using primary key constraints in each database may be different, so when creating the Sink class, mergeRowIdConjuncts will be obtained from BaseJdbcClient (BaseJdbcClient#buildMergeRowIdConjuncts)

  • getMergeRowIdColumnHandle API. For connectors that support Merge, return a RowType containing all primary key column information, otherwise return as original.

  • beginMerge API. Returns JdbcMergeTableHandle, which contains all columns that need to be scanned, (implemented in BaseJdbcClient#updatedScanColumnsForMerge), and the tableHandle of Insert.

  • In the third commit, FTE needs to be supported, so the finishMerge API is also implemented. Write all insert+delete data to the temporary table first, and then update the target table.

Additional context and related issues

There are also some details in https://github.com/trinodb/trino/pull/16944/

Release notes

(x) Release notes are required, with the following suggested text:

# Base Jdbc
* Support SQL MERGE for base jdbc connectors ({issue}`16709`)
* Update Phoenix merge implementation by reusing base Jdbc merge implementation.
* Support update/merge for Ignite
* Support update/merge for Postgresql

chenjian2664 avatar Feb 01 '24 09:02 chenjian2664

@kokosing I divide the https://github.com/trinodb/trino/pull/16944 into those 3 commits, each commit is mainly support merge in one connector(Phoenix, Ignite, Postgresql). Oracle connector will be added latter since it requires to modify some base module tests. I hope this pr is cleaner for your review.

chenjian2664 avatar Feb 02 '24 08:02 chenjian2664

@wendigo @vlad-lyutenko @hashhar would you like to review this PR?

kokosing avatar Feb 05 '24 13:02 kokosing

@electrum Would you mind to have a look when you have sometime? thanks

chenjian2664 avatar Feb 21 '24 09:02 chenjian2664

Postgresql tests failures due to OOM will be resolved https://github.com/trinodb/trino/pull/21127

chenjian2664 avatar Mar 18 '24 03:03 chenjian2664

@hashhar @kokosing Gentle reminder.

chenjian2664 avatar Mar 28 '24 12:03 chenjian2664

@hashhar @kokosing Would you mind to have a review now

chenjian2664 avatar Apr 24 '24 13:04 chenjian2664

Can you please expand description section of the PR with high level overview of approach taken? ie. how it works.

findepi avatar Apr 25 '24 09:04 findepi

@findepi @hashhar @kokosing PTAL

chenjian2664 avatar Apr 30 '24 07:04 chenjian2664

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar May 27 '24 17:05 github-actions[bot]

Hey @chenjian2664, @mosabua, @findepi any update on this PR? We would love to use this feature w/ dbt-trino adapter.

nizarhejazi avatar May 30 '24 18:05 nizarhejazi

Hey @chenjian2664, @mosabua, @findepi any update on this PR? We would love to use this feature w/ dbt-trino adapter.

Waiting for the review

chenjian2664 avatar May 31 '24 01:05 chenjian2664

cc @djsstarburst @electrum for more MERGE support

findepi avatar Jun 03 '24 08:06 findepi

Hey team, any update on the status of this PR?

nizarhejazi avatar Jul 01 '24 17:07 nizarhejazi

Would love to have merge support for postgresql connector, as Update with help of from query is not directly supported. Any timeline when this feature can go live

LovAsawa-Draup avatar Jul 12 '24 14:07 LovAsawa-Draup

We are trying to give this priority in the review and merge queue at the moment. We definitely plan on getting this in. Timing is to be determined though.

mosabua avatar Jul 12 '24 16:07 mosabua

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Jason. This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

cla-bot[bot] avatar Aug 13 '24 02:08 cla-bot[bot]

@chenjian2664 Any update on that issue? And you know if the Oracle connector will also benefit from that? Merge in oracle connector would be very useful for me

shohamyamin avatar Sep 15 '24 06:09 shohamyamin

@shohamyamin for some reason, we move the working on https://github.com/trinodb/trino/pull/23034 now. The workaround is also fit for the Oracle connector.

chenjian2664 avatar Sep 15 '24 13:09 chenjian2664