
[FLINK-35317][cli] Supports submitting multiple pipeline jobs at once

yuxiqian opened this pull request 1 year ago • 2 comments

This closes FLINK-35317.

Currently, the Flink CDC CLI only allows submitting one YAML pipeline job at a time. This PR allows submitting multiple .yml files at once, like this:

./bin/flink-cdc.sh job1.yml job2.yml job3.yml --flink-home /opt/flink ...

which prints the following output (an illustrative per-file pipeline definition is sketched after the sample output):

Pipeline has been submitted to cluster.
Job ID: 60f6e58bd3c0f54ee51b703d7e59b9ca
Job Description: Dummy Pipeline 1
Pipeline has been submitted to cluster.
Job ID: e4403683aaacf185f87db4a769a1db74
Job Description: Dummy Pipeline 2
Pipeline has been submitted to cluster.
Job ID: 350737f6e5e3d7cfaa0b17f77e5a5035
Job Description: Dummy Pipeline 3

yuxiqian avatar May 09 '24 09:05 yuxiqian

I'm not sure whether we need to do this. For example, what happens if the first job is submitted successfully but a later one fails? Will we cancel the first one?

loserwang1024 avatar May 10 '24 02:05 loserwang1024

Hi @loserwang1024, the original idea is that a pipeline definition currently does not allow defining a source and sink with different connectors, so users must write separate YAML files to define hetero-source / hetero-sink pipelines. For now the CLI simply submits all of them as individual jobs without caring whether they succeed or fail, essentially a convenient alternative to executing ./flink-cdc.sh once for each YAML file (as sketched below).
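
As a rough sketch, the behavior is equivalent to invoking the single-job CLI once per file in a shell loop (flags and paths are illustrative, matching the command above):

# Manual equivalent without this PR: one CLI invocation per YAML file.
# Each invocation submits an independent job; a failure in one does not
# roll back the jobs that were already submitted.
for job in job1.yml job2.yml job3.yml; do
  ./bin/flink-cdc.sh "$job" --flink-home /opt/flink
done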

In the future, the CDC Composer might be able to optimize this by reusing sources / sinks and composing merged operator topologies, without changing the CLI frontend interface.

yuxiqian avatar May 10 '24 02:05 yuxiqian

This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

github-actions[bot] avatar Sep 15 '24 00:09 github-actions[bot]