dbt-clickhouse icon indicating copy to clipboard operation
dbt-clickhouse copied to clipboard

insert+replace partitions incremental strategy

Open bryzgaloff opened this issue 2 years ago • 7 comments

Summary

This PR implements insert+replace strategy discussed in #128 which does the following:

  • Creates a new staging table with the same structure as the target table.
  • Inserts data into the staging table.
  • Replaces partitions in the target table from the staging table.

Advantages:

  • Only the involved partitions are replaced: this is much cheaper than reinserting the full table which is implemented in other strategies.
  • If an insertion fails, the target table is not affected.

Checklist

Delete items not relevant to your PR:

  • [x] Unit and integration tests covering the common scenarios were added
  • [x] A human-readable description of the changes was provided to include in CHANGELOG
  • [x] For significant changes, documentation in https://github.com/ClickHouse/clickhouse-docs was updated with further explanations or tutorials

bryzgaloff avatar Oct 26 '23 17:10 bryzgaloff

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Oct 26 '23 17:10 CLAassistant

At the time of publishing the PR is in WIP (work in progress) state since I need an advice from the community. Thus, neither documentation nor tests are updated.

bryzgaloff avatar Oct 26 '23 17:10 bryzgaloff

Hi @simpl1g and @genzgd, I've wrapped up this PR for insert+replace strategy implementation: added some fancy documentation in the README and threw in integration tests (taking a page from lw-deletes). Motivation for the feature is given in the PR's description.

Some discussions above are unresolved, waiting on your pearls of wisdom there :)

Please take a peek at the PR when you get a chance, and hit me up with your thoughts. Hoping we can merge the feature soon and let the good times roll with the new strategy! :)

bryzgaloff avatar Nov 17 '23 20:11 bryzgaloff

Hi @simpl1g and @genzgd! This is a kind reminder about my PR which is ready for your review. We have successfully battle-tested it internally

We install my version from GitHub currently. It would be nice if you can approve it and release to PyPI. If any adjustments are required, please let me know! 🙏

bryzgaloff avatar Dec 08 '23 13:12 bryzgaloff

@bryzgaloff I apologize that we haven't yet had the resources to fully review this PR. As you may have noticed we've been focused on bug fixes and compatibility with the new dbt releases. Please know that we very much appreciate the contribution (especially with test cases and real world usage) and your work is next on the roadmap as we get time.

If you have a chance to resolve the conflicts over the next few weeks that would be appreciated and make the review just a bit easier.

genzgd avatar Dec 15 '23 15:12 genzgd

@bryzgaloff Thanks again for your contribution! I would like to review your PR and merge it within the next few days. Can you sync your fork with the main repo? We upgraded dbt-core to 1.8.0.

After syncing I'll review this one right away.

BentsiLeviav avatar Jul 03 '24 06:07 BentsiLeviav

Hi @BentsiLeviav thank you (and the other reviewers, of course!) for your participation. I am not actively using the plugin right now, but I may get back to handling your review feedback late next week. If the conflicts are not too critical, I might be able to resolve them quickly.

bryzgaloff avatar Jul 05 '24 13:07 bryzgaloff

Hi @BentsiLeviav I have rebased my contribution onto the latest main of this repository. Also, I have renamed the strategy to insert_replace. Please have a look! 👋

bryzgaloff avatar Jul 15 '24 15:07 bryzgaloff

Does this repo has automated tests? I see there are workflows to be approved.

I do not have a quick infra to retest the contribution after the rebase, but if there is not testing workflow, I will perform the manual testing.

bryzgaloff avatar Jul 15 '24 15:07 bryzgaloff

Hi @BentsiLeviav and @genzgd — will you have a chance review and merge the PR soon? I have updated everything according to the review feedback and all the checks have passed. I would like to avoid the need to rebase it again 😅

bryzgaloff avatar Jul 22 '24 12:07 bryzgaloff

Hi @bryzgaloff Huge thanks for this. I will review and update you within the next few days.

BentsiLeviav avatar Jul 24 '24 09:07 BentsiLeviav

Thank you @BentsiLeviav for the approval! What are the next steps for the PR to be merged?

bryzgaloff avatar Jul 29 '24 13:07 bryzgaloff

Hi @bryzgaloff Sure, thank you for your contribution!

Before merging this, could you please add to the doc that this feature is experimental, and wasn't tested with cluster setup? It is crucial to highlight these 2 points.

Once we are done with that, I'll merge your PR.

Thanks again for your work!

BentsiLeviav avatar Jul 30 '24 07:07 BentsiLeviav

Never mind, I'll take care of it :)

BentsiLeviav avatar Aug 01 '24 10:08 BentsiLeviav

Thank you for all you help and merge, @BentsiLeviav! 🤝

bryzgaloff avatar Aug 01 '24 15:08 bryzgaloff