
Errors When Applying Binlog for latin1 Character Set

Open scarpenter opened this issue 6 years ago • 1 comment

We have a number of tables in our database that unfortunately use the latin1 character set. We need to migrate these to utf8mb4 and are hoping to use gh-ost to facilitate this. In doing some testing, I'm running into a problem when changes from the binlog are being applied. I can reliably produce the problem as follows:

  1. SQL setup
create table test_charset(id int not null primary key auto_increment, answer text) default charset=latin1; # This is simpler than our real table but illustrates the problem
insert into test_charset(answer) values('Here’s an apostrophe'); # note that that's a "fancy" apostrophe
  2. gh-ost command (the strange ports are for Docker containers running locally):
bin/gh-ost --user="root" --password="<the password>" --host="127.0.0.1" --port=5506 --database=loyalty --table="test_charset" --verbose --alter="modify column answer text charset utf8mb4" --exact-rowcount --assume-rbr --postpone-cut-over-flag-file="/tmp/gh-ost-cutover.flag" --max-lag-millis=7500 --assume-master-host="127.0.0.1:4406" --execute

The migration proceeds as expected when copying rows and then waits since the cutover flag file is present. If I then insert the same row again, gh-ost crashes:

insert into test_charset(answer) values('Here’s an apostrophe');

The gh-ost output is attached. It seems that the value of the answer column arrives as []uint8 rather than string, so the character set conversion in convertArg (types.go) never runs. The "fancy" apostrophe is valid in the latin1 character set.
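To illustrate what I think is happening, here is a minimal standalone sketch. It is not gh-ost's actual code: decodeLatin1, convertStringOnly and convertBoth are hypothetical names I made up, and the decoder is deliberately simplified (it special-cases only byte 0x92, which is where MySQL's latin1 stores the "fancy" apostrophe, and treats every other byte as the code point of the same value). It assumes the converter only checks for string, as described above.

```go
package main

import "fmt"

// decodeLatin1 turns latin1 bytes into a UTF-8 Go string.
// Simplification for this sketch: only 0x92 (right single quotation mark,
// U+2019) is special-cased; all other bytes map to the same-valued code point.
func decodeLatin1(b []byte) string {
	runes := make([]rune, 0, len(b))
	for _, c := range b {
		if c == 0x92 {
			runes = append(runes, '\u2019')
			continue
		}
		runes = append(runes, rune(c))
	}
	return string(runes)
}

// convertStringOnly (hypothetical) converts only string arguments,
// so a []uint8 value falls through untouched.
func convertStringOnly(arg interface{}) interface{} {
	if s, ok := arg.(string); ok {
		return decodeLatin1([]byte(s))
	}
	return arg
}

// convertBoth (hypothetical) also handles []byte, which is the same type as []uint8.
func convertBoth(arg interface{}) interface{} {
	switch v := arg.(type) {
	case string:
		return decodeLatin1([]byte(v))
	case []byte:
		return decodeLatin1(v)
	}
	return arg
}

func main() {
	// The row value as raw latin1 bytes, with 0x92 for the apostrophe.
	raw := []byte("Here\x92s an apostrophe")
	fmt.Printf("string-only: %q\n", convertStringOnly(raw))
	fmt.Printf("both types:  %q\n", convertBoth(raw))
}
```

With the string-only version the raw latin1 bytes pass through unchanged, and 0x92 on its own is not valid UTF-8, which would explain a failure when the value is written to a utf8mb4 column. A real fix would presumably use a full latin1 decoder rather than this one-byte shortcut.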

Any advice is appreciated.

gh-ost.log

scarpenter avatar Jan 09 '20 19:01 scarpenter

I believe this would be addressed by #1003

jbielick avatar Jul 10 '21 00:07 jbielick