Flink job does not stop when the MySQL database becomes unavailable.
Is your feature request related to a problem? Please describe. I submitted a Flink job that reads from a MySQL database and writes to a Kafka cluster using the MySQL CDC connector. The MySQL database went offline after a month, but the Flink job did not exit. Instead, the BinaryLogClient keeps retrying endlessly.
Describe the solution you'd like. I think the MySQL CDC connector should react to this error instead of retrying endlessly. Maybe the job should fail after a few retries.
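To make the request concrete, here is a minimal sketch of "fail after retrying a few times" in plain Java. It is not the connector's or BinaryLogClient's actual code. The class `BoundedRetry`, the exception `RetriesExhaustedException`, and the parameters `maxAttempts` and `delayMillis` are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an action a bounded number of times, then fails. */
public class BoundedRetry {

    /** Thrown once all attempts are used up, so the caller can fail instead of hanging. */
    public static class RetriesExhaustedException extends RuntimeException {
        public RetriesExhaustedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Runs the action up to maxAttempts times, sleeping delayMillis between attempts. */
    public static <T> T call(Callable<T> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new RetriesExhaustedException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        // Surfacing an exception here is what lets an outer restart strategy see the failure.
        throw new RetriesExhaustedException("Gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        // Simulated connection attempt that always fails.
        Callable<String> connect = () -> {
            throw new java.io.IOException("connection refused");
        };
        try {
            call(connect, 3, 10L);
        } catch (RetriesExhaustedException e) {
            System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```

The point of the sketch is that the retry loop ends by throwing. A loop that only logs and sleeps never reports a failure to the surrounding task.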
Hi @ruanhang1993,
Did you configure a restart strategy like the one below? The attempts option limits the maximum number of restart attempts.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.delay: 15s
restart-strategy.fixed-delay.attempts: 10
2022-07-21 19:26:04.114 INFO io.debezium.util.Threads - Creating thread debezium-mysqlconnector-mysql_binlog_source-binlog-client
2022-07-21 19:26:34.137 ERROR System.err - Jul 21, 2022 7:26:34 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 run WARNING: Failed to restore connection to ipxxxx:portxxxx. Next attempt in 60000ms
2022-07-21 19:27:34.137 ERROR System.err - Jul 21, 2022 7:27:34 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 run INFO: Trying to restore lost connection to ipxxxx:portxxxx
2022-07-21 19:27:34.137 INFO io.debezium.util.Threads - Creating thread debezium-mysqlconnector-mysql_binlog_source-binlog-client
2022-07-21 19:28:04.140 ERROR System.err - Jul 21, 2022 7:28:04 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 run WARNING: Failed to restore connection to ipxxxx:portxxxx. Next attempt in 60000ms
2022-07-21 19:28:51.247 INFO org.apache.flink.metrics.lcs.shaded.com.xiaomi.infra.galaxy.lcs.common.file.DiskManager - datadir: /home/work/app/lcs-agent/data, usableSpace: 488 GB
2022-07-21 19:29:04.140 ERROR System.err - Jul 21, 2022 7:29:04 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 run INFO: Trying to restore lost connection to ipxxxx:portxxxx
The task doesn't exit, so the restart strategy never comes into play. Debezium just keeps retrying the connection.

Thanks @ruanhang1993 @hezhenghongmail for reporting this.
If I set connect.keep.alive = false, could that solve this problem?
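For anyone who wants to try this, a minimal sketch of the setting as a `java.util.Properties` object follows. `connect.keep.alive` is a Debezium MySQL connector property that controls whether a separate thread keeps the connection alive. The class name `KeepAliveProps` is hypothetical. How the properties reach the connector is an assumption to check against your Flink CDC version: for example, through the source builder's Debezium properties setter, or with a `debezium.` prefix in SQL table options. Whether this makes the job fail when MySQL goes away is not confirmed in this thread.

```java
import java.util.Properties;

/** Hypothetical helper that builds the Debezium property discussed above. */
public class KeepAliveProps {

    /** Returns Debezium properties with the keep-alive thread disabled. */
    public static Properties build() {
        Properties props = new Properties();
        // Untested assumption: without the keep-alive thread, the client does not
        // try to restore a lost connection on its own.
        props.setProperty("connect.keep.alive", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```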
Closing this issue because it was created before version 2.3.0 (2022-11-10). Please try the latest version of Flink CDC to see if the issue has been resolved. If the issue is still valid, kindly report it on Apache Jira under project Flink with component tag Flink CDC. Thank you!