Errors When Applying Binlog for latin1 Character Set
We have a number of tables in our database that unfortunately use the latin1 character set. We need to migrate these to utf8mb4 and are hoping to use gh-ost to do so. During testing, I ran into a problem when changes from the binlog are applied. I can reliably reproduce it as follows:
- SQL setup
create table test_charset(id int not null primary key auto_increment, answer text) default charset=latin1; # This is simpler than our real table but illustrates the problem
insert into test_charset(answer) values('Here’s an apostrophe'); # note that that's a "fancy" apostrophe
- gh-ost command (the strange ports are for Docker containers running locally):
bin/gh-ost --user="root" --password="<the password>" --host="127.0.0.1" --port=5506 --database=loyalty --table="test_charset" --verbose --alter="modify column answer text charset utf8mb4" --exact-rowcount --assume-rbr --postpone-cut-over-flag-file="/tmp/gh-ost-cutover.flag" --max-lag-millis=7500 --assume-master-host="127.0.0.1:4406" --execute
The migration proceeds as expected when copying rows and then waits since the cutover flag file is present. If I then insert the same row again, gh-ost crashes:
insert into test_charset(answer) values('Here’s an apostrophe');
The gh-ost output is attached. It seems that the value of the answer column is coming through the binlog as []uint8 instead of string, so the character set conversion in types.go/convertArg isn't happening. The "fancy" apostrophe is valid in the latin1 character set.
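To illustrate what I suspect is happening, here is a minimal Go sketch. It is not gh-ost's actual code: `decodeLatin1`, `convertStringOnly` and `convertBoth` are hypothetical names I made up for this example. It only shows how a conversion that checks for `string` silently skips a `[]uint8` value (`[]byte` and `[]uint8` are the same type in Go), and how a type switch could cover both. The decoder is deliberately simplified to ISO-8859-1; MySQL's latin1 is really cp1252, so a real fix would need a proper cp1252 table for bytes 0x80-0x9F, which is where the fancy apostrophe lives.

```go
package main

import "fmt"

// decodeLatin1 is a hypothetical helper: it treats every byte as the
// Unicode code point of the same value (ISO-8859-1) and returns UTF-8.
// Assumption: good enough for illustration only, since MySQL's latin1
// is cp1252 and differs from ISO-8859-1 in the 0x80-0x9F range.
func decodeLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// convertStringOnly mimics a conversion that only handles string
// arguments. Any other type, including []uint8, is returned untouched.
func convertStringOnly(arg interface{}) interface{} {
	if s, ok := arg.(string); ok {
		return decodeLatin1([]byte(s))
	}
	return arg
}

// convertBoth handles string and []byte ([]uint8) arguments alike.
func convertBoth(arg interface{}) interface{} {
	switch v := arg.(type) {
	case string:
		return decodeLatin1([]byte(v))
	case []byte:
		return decodeLatin1(v)
	}
	return arg
}

func main() {
	// "café" encoded as latin1: the final byte 0xE9 is not valid UTF-8.
	raw := []byte("caf\xe9")

	// The string-only version leaves the raw bytes unconverted.
	fmt.Printf("%T\n", convertStringOnly(raw))

	// The type switch decodes the bytes into a UTF-8 string.
	fmt.Printf("%T %q\n", convertBoth(raw), convertBoth(raw))
}
```

If the binlog event really does deliver the column as `[]uint8`, something along the lines of the second function would be needed before the value is written to the ghost table.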
Any advice is appreciated.
I believe this would be addressed by #1003