flink-cdc

fix bug 'mysql primary key of binary data type is compared incorrectly'

Open stgztsw opened this issue 3 years ago • 7 comments

When a MySQL table uses a binary-type id column as its primary key, MySQL CDC runs into a problem during the snapshot split. It gets the min and max values of the id column by executing 'SELECT MIN(id), MAX(id) FROM table'. It then determines splitStart and splitEnd for each chunk by paging through the range between the min and max values, and checks whether splitEnd is greater than the max value. But CDC uses "obj1.toString().compareTo(obj2.toString())" to compare the byte[] values. This comparison differs from MySQL's ordering, so an intermediate splitEnd can be judged greater than the max value. CDC then uses SQL of the form "select * from table where id > ? " to load all remaining rows of the table, which makes the TaskManager OOM.
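To illustrate the mismatch described above, here is a minimal sketch, not the code from this PR. The class and method names (BinaryKeyCompare, compareViaToString, compareUnsigned) are hypothetical. It assumes the key values arrive as Java byte[] objects and that MySQL orders binary strings byte by byte using unsigned byte values, with a shorter value sorting before a longer one that shares its prefix.

```java
// Hypothetical illustration only; not the actual flink-cdc implementation.
final class BinaryKeyCompare {

    // String-based comparison as quoted in the description. For a byte[],
    // toString() yields something like "[B@1b6d3586" (type tag plus identity
    // hash), so the result says nothing about the bytes themselves.
    static int compareViaToString(Object a, Object b) {
        return a.toString().compareTo(b.toString());
    }

    // Unsigned lexicographic comparison: compare byte by byte as values in
    // 0..255, then fall back to length when one array is a prefix of the other.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] splitEnd = {0x7F};
        byte[] max = {(byte) 0x80};

        // 0x7F (127) sorts before 0x80 (128) when bytes are treated as unsigned.
        System.out.println(compareUnsigned(splitEnd, max) < 0); // prints "true"

        // The string-based result depends on identity hash codes and can
        // change between runs, so no expected output is given here.
        System.out.println(compareViaToString(splitEnd, max));
    }
}
```

With a comparison along these lines, an intermediate splitEnd is only treated as reaching the max value when its bytes actually sort at or after it, so the unbounded "id > ?" query would be issued only for the final chunk.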

stgztsw avatar Dec 09 '22 02:12 stgztsw

Thanks for the contribution, @stgztsw. Could you add a unit test?

I found some problems with the source. I will fix them and retest, then add the unit test later. Sorry for the inconvenience.

stgztsw avatar Dec 12 '22 11:12 stgztsw

Hello, I pushed again. Please help me review it, thanks.

stgztsw avatar Dec 14 '22 01:12 stgztsw

Thanks for the contribution, @stgztsw. Please rebase onto the master branch. There are some updates in master.

ruanhang1993 avatar Jul 05 '23 09:07 ruanhang1993

Any progress on this request? It seems this issue still exists.

chenxu2656 avatar Nov 24 '23 07:11 chenxu2656

Hi @stgztsw, thanks for your contribution! Before we can merge this PR, could you please rebase it with the latest master branch? You may need to move com.ververica.cdc.connectors to org.apache.flink.cdc.connectors, and move flink-connector-mysql-cdc to flink-cdc-connect/flink-cdc-source-connectors.

yuxiqian avatar Apr 26 '24 02:04 yuxiqian

This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

github-actions[bot] avatar Jul 17 '24 00:07 github-actions[bot]