seatunnel
[Feature][Connector-V2 E2E] Data consistency test process design
Search before asking
- [X] I have searched the existing feature requests and found no similar requirement.
Description
We need to support data consistency tests in the Connector-V2 E2E suite. I have some ideas about this and welcome everyone to discuss.
Test Sink Connector
Fake Source Connector
If we want to test the data consistency of a sink connector, we can use the Fake Source connector. The Fake Source connector will support defining the number of rows and the primary key fields in the future. Defining primary key fields is useful for testing sinks that implement exactly-once through idempotent writes. If we can simulate a task failure and then restore the task, we can complete the data consistency test.
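To illustrate why primary key fields matter here, the following is a minimal sketch, not SeaTunnel code. The `Row` class, `generateRows`, and `writeWithReplay` are hypothetical names, and the sink is modeled as an in-memory map keyed by primary key. It shows that replaying rows after a restore leaves an idempotent sink with exactly one row per key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IdempotentReplayDemo {

    /** Hypothetical row shape: a primary key plus one payload column. */
    static final class Row {
        final long id;
        final String name;

        Row(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Generates rowCount deterministic rows, so a replay produces identical data. */
    static List<Row> generateRows(int rowCount) {
        List<Row> rows = new ArrayList<>();
        for (long id = 0; id < rowCount; id++) {
            rows.add(new Row(id, "name-" + id));
        }
        return rows;
    }

    /**
     * Writes rows [0, failAt), then simulates a restore by rewriting rows
     * [replayFrom, end). The map stands in for an upsert-by-primary-key sink.
     */
    static Map<Long, Row> writeWithReplay(List<Row> rows, int failAt, int replayFrom) {
        if (replayFrom > failAt) {
            throw new IllegalArgumentException("replayFrom must not exceed failAt");
        }
        Map<Long, Row> table = new LinkedHashMap<>();
        for (int i = 0; i < failAt; i++) {
            table.put(rows.get(i).id, rows.get(i)); // idempotent write
        }
        // Simulated failure happens here; the job restarts from the last checkpoint.
        for (int i = replayFrom; i < rows.size(); i++) {
            table.put(rows.get(i).id, rows.get(i)); // replayed rows overwrite, not duplicate
        }
        return table;
    }

    public static void main(String[] args) {
        Map<Long, Row> table = writeWithReplay(generateRows(100), 60, 40);
        System.out.println("rows in sink: " + table.size());
    }
}
```

Rows 40 to 59 are written twice in this run, yet the sink still ends with 100 rows, which is the property the consistency test would assert.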
How to simulate task failure and restore task.
I think we can use the Fake Source connector to simulate task failure too. We can add a function to Fake Source that actively triggers a failure. To ensure Fake Source can replay reads after a restore, it also needs to support snapshots.
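One possible shape for this idea, as a sketch only: the `FakeReader` class below is hypothetical and does not use the real SeaTunnel source API. It assumes the reader's state is a single offset, that a failure is injected once at a configured offset, and that rows become visible only at checkpoints.

```java
import java.util.ArrayList;
import java.util.List;

class ReplayableFakeSourceDemo {

    /** Hypothetical fake reader: emits ids 0..rowCount-1 and can fail once on demand. */
    static final class FakeReader {
        private final int rowCount;
        private final int failAtOffset; // offset at which to throw; negative disables
        private boolean failureTriggered;
        private int offset;

        FakeReader(int rowCount, int failAtOffset) {
            this.rowCount = rowCount;
            this.failAtOffset = failAtOffset;
        }

        /** Returns the next id, or -1 when exhausted; throws once at failAtOffset. */
        long next() {
            if (!failureTriggered && offset == failAtOffset) {
                failureTriggered = true;
                throw new IllegalStateException("injected failure at offset " + offset);
            }
            if (offset >= rowCount) {
                return -1;
            }
            return offset++;
        }

        /** The snapshot is just the current read offset. */
        int snapshotState() {
            return offset;
        }

        /** Rewinds the reader so rows after the snapshot are read again. */
        void restoreState(int snapshot) {
            offset = snapshot;
        }
    }

    /** Runs the reader to completion, recovering from the injected failure. */
    static List<Long> runWithRecovery(int rowCount, int failAtOffset, int checkpointInterval) {
        FakeReader reader = new FakeReader(rowCount, failAtOffset);
        List<Long> committed = new ArrayList<>(); // rows made visible at checkpoints
        List<Long> pending = new ArrayList<>();   // rows read since the last checkpoint
        int lastSnapshot = reader.snapshotState();
        while (true) {
            try {
                long id = reader.next();
                if (id < 0) {
                    break;
                }
                pending.add(id);
                if (pending.size() == checkpointInterval) {
                    committed.addAll(pending);
                    pending.clear();
                    lastSnapshot = reader.snapshotState();
                }
            } catch (IllegalStateException e) {
                // Task failure: drop uncommitted rows and replay from the snapshot.
                pending.clear();
                reader.restoreState(lastSnapshot);
            }
        }
        committed.addAll(pending); // final commit when the source is exhausted
        return committed;
    }

    public static void main(String[] args) {
        System.out.println(runWithRecovery(10, 7, 3));
    }
}
```

Because the failure fires only once and the reader rewinds to the last snapshot, every id is committed exactly once in this model.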
How to check data
We can check the rows that were written to the sink.
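As a rough sketch of what such a check could compute, assuming the Fake Source generated sequential primary keys from 0 to `expectedRowCount - 1` (an assumption of this example, as are the `SinkRowChecker` and `Report` names), the keys read back from the sink can be classified as missing, duplicated, or unexpected:

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class SinkRowChecker {

    /** Result of comparing the keys found in the sink with the expected key range. */
    static final class Report {
        final SortedSet<Long> missing = new TreeSet<>();
        final SortedSet<Long> duplicated = new TreeSet<>();
        final SortedSet<Long> unexpected = new TreeSet<>();

        boolean isConsistent() {
            return missing.isEmpty() && duplicated.isEmpty() && unexpected.isEmpty();
        }
    }

    /** Compares the keys read back from the sink with the range [0, expectedRowCount). */
    static Report check(long expectedRowCount, List<Long> writtenKeys) {
        Report report = new Report();
        SortedSet<Long> seen = new TreeSet<>();
        for (Long key : writtenKeys) {
            if (key < 0 || key >= expectedRowCount) {
                report.unexpected.add(key);   // key the source never produced
            } else if (!seen.add(key)) {
                report.duplicated.add(key);   // written more than once
            }
        }
        for (long key = 0; key < expectedRowCount; key++) {
            if (!seen.contains(key)) {
                report.missing.add(key);      // lost during failure or restore
            }
        }
        return report;
    }
}
```

A duplicated key would point to a sink that is not exactly-once, while a missing key would point to data lost across the restore.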
Test Source Connector
Test JDBC Sink Connector
If we want to test the data consistency of a source connector, we can add a Test JDBC Sink connector. It needs to support exactly-once.
How to simulate task failure and restore task.
We can add a function to the Test JDBC Sink connector that actively triggers a failure.
How to check data
There are two ways to do it.
First one: after the job xxxSource -> TestJDBCSink finishes, we automatically create a job JDBCSource -> AssertSink and use AssertSink to check the data.
Shortcoming: this way needs to run two jobs.
Advantage: this way standardizes the tests; people only need to configure the check rules in the AssertSink connector.
Second one: add a Java program that checks the data in MySQL/PG.
Shortcoming: this way cannot standardize the tests; every source connector E2E test needs to add its own check program and define its own check rules.
Advantage: only one job needs to run.
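To make "check rules" more concrete, here is a small sketch of how reusable rules might look if they were shared across connector tests. This is not the AssertSink configuration format or API; the `CheckRule` interface and the `rowCount`, `notNull`, and `unique` helpers are hypothetical, and rows are modeled as plain maps that would in practice be read back from MySQL/PG.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CheckRulesDemo {

    /** Hypothetical check rule: returns an error message, or null if the rows pass. */
    interface CheckRule {
        String validate(List<Map<String, Object>> rows);
    }

    /** Passes only when the result set has exactly the expected number of rows. */
    static CheckRule rowCount(int expected) {
        return rows -> rows.size() == expected
                ? null
                : "expected " + expected + " rows but found " + rows.size();
    }

    /** Passes only when no row has a null value in the given field. */
    static CheckRule notNull(String field) {
        return rows -> {
            for (Map<String, Object> row : rows) {
                if (row.get(field) == null) {
                    return "null value in field " + field;
                }
            }
            return null;
        };
    }

    /** Passes only when every row has a distinct value in the given field. */
    static CheckRule unique(String field) {
        return rows -> {
            Set<Object> seen = new HashSet<>();
            for (Map<String, Object> row : rows) {
                if (!seen.add(row.get(field))) {
                    return "duplicate value in field " + field + ": " + row.get(field);
                }
            }
            return null;
        };
    }

    /** Applies every rule and collects the error messages of the rules that fail. */
    static List<String> runChecks(List<Map<String, Object>> rows, List<CheckRule> rules) {
        List<String> errors = new ArrayList<>();
        for (CheckRule rule : rules) {
            String error = rule.validate(rows);
            if (error != null) {
                errors.add(error);
            }
        }
        return errors;
    }
}
```

With rules expressed this way, each source connector test would only declare which rules apply, which is the kind of standardization the first approach aims for; the second approach would instead hand-write equivalent checks per connector.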
mail list: https://lists.apache.org/thread/148v3w2tbz8byxwnwbk46mkgzoj600w5
Usage Scenario
No response
Related issues
No response
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
@getChan @2013650523 @531651225 @laglangyue @legendtkl @leo65535 @lhyundeadsoul @hailin0 @Hisoka-X @ashulin @ic4y @TyrantLucifer @iture123 and everyone who may be interested in this question: do you have any suggestions?
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.