Fix bug: MySQL primary key of binary data type is compared incorrectly
When a MySQL table uses an id column of binary type as its primary key, MySQL CDC splits the snapshot as follows. It first gets the min and max values of the id column by executing 'SELECT MIN(id), MAX(id) FROM table'. It then determines splitStart and splitEnd for each split by paging through the range between the min and max values, and checks whether splitEnd is bigger than the max value. However, CDC compares the byte[] values with "obj1.toString().compareTo(obj2.toString())". This comparison differs from MySQL's, so an intermediate splitEnd can be judged bigger than the max value. CDC then uses the SQL "select * from table where id > ?" to load the whole rest of the table at once, which makes the TaskManager run out of memory (OOM).
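To illustrate the mismatch, here is a minimal sketch rather than the actual change in this PR. It assumes Java 9+ and uses a hypothetical helper named `compareBinaryKeys`. In Java, `byte[].toString()` returns a type tag plus an identity hash code instead of the array contents, so comparing those strings says nothing about the key bytes. MySQL compares binary strings byte by byte using unsigned byte values, which `Arrays.compareUnsigned` mirrors; the signed `Arrays.compare` would also give the wrong order for bytes at or above 0x80.

```java
import java.util.Arrays;

class BinaryKeyCompare {
    // Hypothetical helper (not from the PR): orders binary keys byte by byte,
    // treating each byte as an unsigned value in the range 0..255.
    static int compareBinaryKeys(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        byte[] low = {0x01, (byte) 0xFF};
        byte[] high = {(byte) 0x80, 0x00};

        // Unsigned comparison: 0x01 < 0x80, so low sorts before high.
        System.out.println(compareBinaryKeys(low, high) < 0); // true

        // Signed comparison treats 0x80 as -128, so the order is reversed.
        System.out.println(Arrays.compare(low, high) > 0); // true

        // toString() does not depend on the array contents at all:
        // two arrays with equal bytes still produce unrelated strings.
        System.out.println(low.toString().startsWith("[B@")); // true
    }
}
```

With a comparison like this, the check of splitEnd against the max value would follow the same ordering as the `id > ?` predicate evaluated by MySQL.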
Thanks for the contribution @stgztsw, could you add a unit test?
I found some problems with the source. I will fix them and retest, then add the unit test later. Sorry for the inconvenience.
Hello, I pushed again. Please help review it, thanks.
Thanks for the contribution, @stgztsw. Please rebase onto the master branch; there are some updates in master.
Any progress on this request? It seems this issue still exists.
Hi @stgztsw, thanks for your contribution! Before we can merge this PR, could you please rebase it with the latest master branch? You may need to move com.ververica.cdc.connectors to org.apache.flink.cdc.connectors, and move flink-connector-mysql-cdc to flink-cdc-connect/flink-cdc-source-connectors.
This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.