Flink: Add table.exec.iceberg.use-v2-sink option
This PR adds a table.exec.iceberg.use-v2-sink configuration option allowing to use Flink's Sink v2 API described in the FLIP-143 document.
The configuration option is by default set to false.
This PR is the follow-up of discussion in #10179 and in #11219
@rodmeneses , @pvary please have a look at this PR. In the change, I added the configuration option that we talked about. I also added the documentation. In the docs I showed differences between the interfaces that I found while working on unifying the unit tests for both sinks.
This is still a draft, as I have no idea how to show in unit tests, that this configuration option does anything 😅. I've only checked in the debugger that the code is entering the correct code blocks. I could add a test on IcebergTableSink.getSinkRuntimeProvider level that would check which implementation is chosen but that doesn't seem like a good idea.
Do you know of any differences in the behaviour of these sinks that I could show in the test? I see that there are plans to add small file compaction in SinkV2 based implementation. Perhaps we should add this test when this feature will be available. WDYT?
This is still a draft, as I have no idea how to show in unit tests, that this configuration option does anything 😅. I've only checked in the debugger that the code is entering the correct code blocks. I could add a test on
IcebergTableSink.getSinkRuntimeProviderlevel that would check which implementation is chosen but that doesn't seem like a good idea.
I found out the way to verify if the correct API used. I added the test. This change is ready for review.
@rodmeneses, @stevenzwu: Please review the PR. Especially the documentation part.
@rodmeneses @pvary I've done all fixes that were discussed in comments. Can we move forward with this change? Do we need more eyes to take a look at the docs part? @stevenzwu can you take a look?
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.
Hi @arkadius is this still needed/relevant ? please advise, as it will be closed due to inactivity soon cc @pvary
I think that this option would benefit this project in a good transitional migration to the new interface in Table API / Flink SQL. What is the alternative plan for the migration between the old implementation and the new one?
We definitely need this. @arkadius: Could you please make sure the tests are run? Just change the commit message of the last commit and force push the branch, or something?
@arkadius: Could you please make sure the tests are run? Just change the commit message of the last commit and force push the branch, or something?
Sure, I rebased the branch to main to make sure that all tests will be run on the state that will be present after merge
Thanks for the PR @arkadius and @rodmeneses for the review!
@arkadius: Could you please backport the change to the relevant older Flink branches?
@arkadius: Could you please backport the change to the relevant older Flink branches?
Sure, here is backport for Flink 1.19: #11665 The SinkV2 API was added in Flink 1.19 so it is only relevant for this version.