
[tidb-cdc] Channel shutdown invoked

RaiToKU opened this issue 3 years ago · 1 comment

Describe the bug (please use English): [tidb-cdc] After running for a period of time, the job reports "Channel shutdown invoked".

Environment:

  • Flink version : 1.13.5/1.13.6/1.14.4
  • Flink CDC version: 2.2.0/2.2.1
  • Database and version: v5.4/v6.0

To Reproduce (steps to reproduce the behavior):

  1. The test data:
mysql> select count(id) from source_table;
+-----------+
| count(id) |
+-----------+
|    761920 |
+-----------+
1 row in set (0.61 sec)
mysql> show table source_table regions;
+-----------+---------------------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+
| REGION_ID | START_KEY                                                           | END_KEY                                                                             | LEADER_ID | LEADER_STORE_ID | PEERS               | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |
+-----------+---------------------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+
|     16637 | t_1640_i_2_015343383538333933ff3836373233363138ff0000000000000000f7 | t_1640_r_57561                                                                      |     16640 |               5 | 16638, 16639, 16640 |          0 |             0 |  239604046 |                   96 |           917109 |
|     17049 | t_1640_r_57561                                                      | t_1640_r_180751                                                                     |     17050 |               1 | 17050, 17051, 17052 |          0 |             0 |  524664294 |                   54 |           122881 |
|     16601 | t_1640_r_180751                                                     | t_1640_r_367125                                                                     |     16604 |               5 | 16602, 16603, 16604 |          0 |             0 |  851717262 |                   76 |           204800 |
|     17109 | t_1640_r_367125                                                     | t_1640_r_608667                                                                     |     17111 |               4 | 17110, 17111, 17112 |          0 |             0 | 1101693912 |                   94 |           245760 |
|      9181 | t_1640_r_608667                                                     | t_1643_5f698000000000000001016d616c6c31383138ff3631383631363233ff3035310000000000fa |      9183 |               4 | 9182, 9183, 9184    |          0 |             0 |  703376597 |                   64 |           164126 |
|     16569 | t_1622_                                                             | t_1640_i_1_01736d617274626f78ff3632373935303835ff3730363133370000fd                 |     16571 |               4 | 16570, 16571, 16572 |          0 |           561 |        180 |                   69 |           768937 |
|     16393 | t_1640_i_1_01736d617274626f78ff3632373935303835ff3730363133370000fd | t_1640_i_2_015343383538333933ff3836373233363138ff0000000000000000f7                 |     16394 |               1 | 16394, 16395, 16396 |          0 |           388 |         40 |                   51 |           647180 |
+-----------+---------------------------------------------------------------------+-------------------------------------------------------------------------------------+-----------+-----------------+---------------------+------------+---------------+------------+----------------------+------------------+
7 rows in set (0.02 sec)
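The printable keys in the output above follow TiDB's key layout: record keys have the form `t_{tableID}_r_{rowID}` and index keys the form `t_{tableID}_i_{indexID}_…`, so `source_table` here has internal table ID 1640 with its row data split across several regions. A minimal parsing sketch (the helper name is hypothetical, not part of any TiDB API):

```python
import re

def parse_region_key(key: str):
    """Parse the printable region key form shown by SHOW TABLE REGIONS."""
    m = re.match(r"t_(\d+)_r_(\d+)$", key)
    if m:  # record key: t_{tableID}_r_{rowID}
        return {"table_id": int(m.group(1)), "kind": "record", "row_id": int(m.group(2))}
    m = re.match(r"t_(\d+)_i_(\d+)_", key)
    if m:  # index key: t_{tableID}_i_{indexID}_{encoded index values}
        return {"table_id": int(m.group(1)), "kind": "index", "index_id": int(m.group(2))}
    m = re.match(r"t_(\d+)_$", key)
    if m:  # bare table prefix
        return {"table_id": int(m.group(1)), "kind": "table_prefix"}
    return {"kind": "other", "raw": key}

# Row-data boundaries taken from the output above; each one is the START_KEY
# of one region and the END_KEY of its neighbour.
for b in ["t_1640_r_57561", "t_1640_r_180751", "t_1640_r_367125", "t_1640_r_608667"]:
    print(parse_region_key(b))
```

The log later in this report starts one `RegionCDCClient` stream per region ("start streaming region: …"), which appears to be why the errors name individual region IDs such as 16637 and 17049.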
  2. The test code:
CREATE TABLE source_table (
  id STRING,
  create_time TIMESTAMP(0),
  update_time TIMESTAMP(0),
  `user_id` BIGINT,
  `version` INT,
  order_no STRING,
  pay_type BIGINT,
  pay_status INT,
  data_source INT,
  amount DOUBLE,
  tenant_id BIGINT,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'tidb-cdc',
  'tikv.grpc.timeout_in_ms' = '20000',
  'tikv.grpc.scan_timeout_in_ms' = '20000',
  'pd-addresses' = '****',
  'scan.startup.mode' = 'latest-offset',
  'database-name' = '****',
  'table-name' = '****'
);
CREATE TABLE sink_table_print(
  id STRING,
  create_time TIMESTAMP(0),
  update_time TIMESTAMP(0),
  `user_id` BIGINT,
  `version` INT,
  order_no STRING,
  pay_type BIGINT,
  pay_status INT,
  data_source INT,
  amount DOUBLE,
  tenant_id BIGINT,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'print'
);

INSERT INTO
  sink_table_print
SELECT
  o.id AS id,
  o.create_time,
  o.update_time,
  o.`user_id`,
  o.`version`,
  o.order_no,
  o.pay_type,
  o.pay_status,
  o.data_source,
  o.amount,
  o.tenant_id AS tenant_id
FROM
  source_table AS o;
  3. The error, after running for a period of time:
2022-06-06 15:07:08,844 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717045835857921, regionId: 9181
2022-06-06 15:07:08,920 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717045862334466, regionId: 17109
2022-06-06 15:07:09,846 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046098264068, regionId: 9181
2022-06-06 15:07:09,921 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046124478466, regionId: 17109
2022-06-06 15:07:10,921 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046360145921, regionId: 9181
2022-06-06 15:07:11,001 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046399729667, regionId: 17109
2022-06-06 15:07:11,855 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046622289922, regionId: 9181
2022-06-06 15:07:12,117 INFO  org.tikv.cdc.CDCClient                                       [] - handle resolvedTs: 433717046399729667, regionId: 17109
2022-06-06 15:07:12,311 ERROR org.tikv.cdc.RegionCDCClient                                 [] - region CDC error: region: 16637, error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone
2022-06-06 15:07:12,312 ERROR org.tikv.cdc.RegionCDCClient                                 [] - region CDC error: region: 17049, error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone
2022-06-06 15:07:12,313 INFO  org.tikv.cdc.CDCClient                                       [] - handle error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone, regionId: 16637
2022-06-06 15:07:12,314 INFO  org.tikv.cdc.CDCClient                                       [] - remove regions: [16637]
2022-06-06 15:07:12,314 INFO  org.tikv.cdc.RegionCDCClient                                 [] - close (region: 16637)
2022-06-06 15:07:12,314 INFO  org.tikv.cdc.RegionCDCClient                                 [] - terminated (region: 16637)
2022-06-06 15:07:12,317 INFO  org.tikv.cdc.CDCClient                                       [] - remove regions: [17049]
2022-06-06 15:07:12,317 INFO  org.tikv.cdc.RegionCDCClient                                 [] - close (region: 17049)
2022-06-06 15:07:12,317 INFO  org.tikv.cdc.RegionCDCClient                                 [] - terminated (region: 17049)
2022-06-06 15:07:12,317 INFO  org.tikv.cdc.CDCClient                                       [] - add regions: [{Region[16637] ConfVer[5] Version[793] Store[4] KeyRange[t\200\000\000\000\000\000\006h_i\200\000\000\000\000\000\000\002\001SC858393\37786723618\377\000\000\000\000\000\000\000\000\367]:[t\200\000\000\000\000\000\006h_r\200\000\000\000\000\000\340\331]}, {Region[17049] ConfVer[5] Version[785] Store[4] KeyRange[t\200\000\000\000\000\000\006h_r\200\000\000\000\000\000\340\331]:[t\200\000\000\000\000\000\006h_r\200\000\000\000\000\002\302\017]}], timestamp: 433583509931556869
2022-06-06 15:07:12,317 INFO  org.tikv.cdc.RegionCDCClient                                 [] - start streaming region: 16637, running: true
2022-06-06 15:07:12,318 INFO  org.tikv.cdc.RegionCDCClient                                 [] - start streaming region: 17049, running: true
2022-06-06 15:07:12,318 ERROR org.tikv.cdc.RegionCDCClient                                 [] - region CDC error: region: 16637, error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked
2022-06-06 15:07:12,318 INFO  org.tikv.cdc.CDCClient                                       [] - keyRange applied
2022-06-06 15:07:12,318 ERROR org.tikv.cdc.RegionCDCClient                                 [] - region CDC error: region: 17049, error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked
2022-06-06 15:07:12,341 INFO  org.tikv.cdc.CDCClient                                       [] - handle error: org.tikv.shade.io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone, regionId: 17049
2022-06-06 15:07:12,342 INFO  org.tikv.cdc.CDCClient                                       [] - remove regions: [17049]
2022-06-06 15:07:12,342 INFO  org.tikv.cdc.RegionCDCClient                                 [] - close (region: 17049)
2022-06-06 15:07:12,342 INFO  org.tikv.cdc.RegionCDCClient                                 [] - terminated (region: 17049)
2022-06-06 15:07:12,343 INFO  org.tikv.cdc.CDCClient                                       [] - remove regions: [16637]
2022-06-06 15:07:12,343 INFO  org.tikv.cdc.RegionCDCClient                                 [] - close (region: 16637)

Additional description: (1) I inserted some data into the TiDB source table, but no change events were emitted.

mysql> select count(id) from source_table;
+-----------+
| count(id) |
+-----------+
|    761923 |
+-----------+
1 row in set (0.61 sec)

I tried changing the Flink, Flink CDC, and TiDB versions, but the result was the same.

(2) I have observed that the set of regions the client listens to seems to become inconsistent with TiDB after a period of time. This is just a guess: I suspect the TiDB client does not refresh the region topology in time, because when I listen to a table with only one region, it works without any exceptions.

mysql> show table biz_table regions;
+-----------+-----------------------------+-----------------------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
| REGION_ID | START_KEY                   | END_KEY                     | LEADER_ID | LEADER_STORE_ID | PEERS            | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS |
+-----------+-----------------------------+-----------------------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
|      6737 | t_1142_5f72800022b5f7868877 | t_1295_5f720000000000000000 |      6738 |               1 | 6738, 6739, 6740 |          0 |             0 |    4224980 |                   72 |           603294 |
+-----------+-----------------------------+-----------------------------+-----------+-----------------+------------------+------------+---------------+------------+----------------------+------------------+
1 row in set (0.01 sec)
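One external cross-check on this guess, assuming TiDB's documented TSO layout (the upper bits of a TSO are a physical Unix timestamp in milliseconds; the lowest 18 bits are a logical counter): decoding the values from the error log shows that the `add regions ... timestamp: 433583509931556869` used when regions 16637 and 17049 were re-added is almost six days older than the `resolvedTs` values handled moments earlier, which fits a stale cached topology.

```python
from datetime import datetime, timezone

def tso_to_datetime(tso: int) -> datetime:
    """Decode a TiDB TSO: physical_ms = tso >> 18, with 18 logical bits below."""
    return datetime.fromtimestamp((tso >> 18) / 1000, tz=timezone.utc)

resolved_ts = 433717045835857921  # "handle resolvedTs" logged just before the error
readded_ts = 433583509931556869   # "add regions ... timestamp" after the reconnect

print(tso_to_datetime(resolved_ts))  # 2022-06-06 ~07:07 UTC, matching the 15:07 (UTC+8) log time
print(tso_to_datetime(readded_ts))   # 2022-05-31, almost six days earlier
```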

RaiToKU avatar Jun 07 '22 03:06 RaiToKU

Me too. Did the owner solve it?

MengXiangDing avatar Sep 07 '22 12:09 MengXiangDing

Closing this issue because it was created before version 2.3.0 (2022-11-10). Please try the latest version of Flink CDC to see if the issue has been resolved. If the issue is still valid, kindly report it on Apache Jira under project Flink with component tag Flink CDC. Thank you!

PatrickRen avatar Feb 28 '24 15:02 PatrickRen