
Wide character in subroutine entry -- UTF8 parsing problem?

Open durkie opened this issue 5 years ago • 4 comments

Hi folks -- I've been really grappling with this for quite some time, to the point of creating a fresh Ubuntu 19.04 server, installing everything, and still having trouble with wide characters.

Versions of everything:

Bucardo 5.5.0
MCP Postgres library version: 110005
(15928) [Fri Sep 13 21:45:15 2019] MCP bucardo: /usr/bin/bucardo
(15928) [Fri Sep 13 21:45:15 2019] MCP Bucardo.pm: /usr/share/perl5/Bucardo.pm
(15928) [Fri Sep 13 21:45:15 2019] MCP OS: linux  Perl: /usr/bin/perl 5.28.1
(15928) [Fri Sep 13 21:45:15 2019] MCP DBI version: 1.642  DBD::Pg version: 3.10.0 (31000) DBIx::Safe version: 1.2.5

I'm getting most of my tables synced correctly, but when I add a table that has a varchar column with foreign characters and emoji in it, my bucardo Kids die over and over with

Kid 15965 exiting at cleanup_kid. Sync "rdsdelta" public.users Reason: Wide character in subroutine entry at /usr/share/perl5/Bucardo.pm line 9967. Line: 4997 

The line of interest:

if ('postgres' eq $type) {
***    $Target->{dbh}->pg_putcopydata($buffer); ***
}

My guess is that this is some kind of UTF8 parsing error, since most tables up to this point have had little text in them. I'm using pretty up-to-date versions of everything too, so I feel like I'm out of options at this point except to ask for help. Any ideas what I can do?

durkie avatar Sep 13 '19 21:09 durkie

To add some color to this, Bucardo wouldn't let me sync two other tables because it thought that the master and replica versions of those tables differed: the master version had a "🚲" icon as the default value for a column, and it claimed the slave version had "δº" as that same default value, even though it was also the "🚲" bicycle emoji.
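
For illustration only (not a diagnosis of what Bucardo did here): a mismatch like this can appear when the UTF-8 bytes of an emoji are read back under a single-byte encoding on one side of a comparison. The sketch below is in Python rather than Bucardo's Perl, and the choice of Latin-1 as the "wrong" encoding is an assumption; the "δº" reported above may well come from a different code page.

```python
# The bicycle emoji is a single code point, U+1F6B2.
bike = "\U0001F6B2"

# Encoded as UTF-8 it becomes four bytes.
utf8_bytes = bike.encode("utf-8")

# Assumption: one side decodes those bytes with a single-byte encoding
# (Latin-1 here). Every byte then becomes its own character, so the
# result is a different four-character string, not the emoji.
misread = utf8_bytes.decode("latin-1")

# Decoding with the correct encoding restores the original string.
round_trip = utf8_bytes.decode("utf-8")

print(utf8_bytes)          # the raw bytes
print(len(misread))        # number of characters after the misread
print(misread == bike)     # the two "defaults" no longer compare equal
print(round_trip == bike)
```

If both sides hold the same bytes but decode them differently, a string comparison of the two default values fails even though the stored data is identical.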

durkie avatar Sep 14 '19 03:09 durkie

Dang, these keep piling up. New one here:

Warning! Aborting due to exception for public.segment_extents:? Error was DBD::Pg::db pg_putcopyend failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xe9 0x6c 0xe8\nCONTEXT:  COPY segment_extents, line 323 at /usr/share/perl5/Bucardo.pm line 10097.
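
As a side note on the bytes in that message: `0xe9 0x6c 0xe8` is not valid UTF-8, because `0xe9` starts a three-byte sequence and `0x6c` is not a continuation byte. Read as Latin-1, the same bytes are the text "élè". The Python sketch below only demonstrates that observation; that the source data was actually Latin-1 is an assumption, and the helper `try_decode` is a hypothetical name used for illustration.

```python
# The three bytes PostgreSQL rejected in the COPY error.
raw = bytes([0xE9, 0x6C, 0xE8])


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Invalid as UTF-8: 0xE9 needs two continuation bytes after it.
as_utf8 = try_decode(raw, "utf-8")

# Valid as Latin-1 (assumed source encoding): e-acute, l, e-grave.
as_latin1 = try_decode(raw, "latin-1")

# Re-encoding that text as UTF-8 gives bytes the server would accept.
reencoded = as_latin1.encode("utf-8")

print(as_utf8)
print(as_latin1)
print(reencoded)
```

This suggests that at least some rows may be reaching the target as single-byte text rather than UTF-8, which would be consistent with a client encoding mismatch somewhere between source and target.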

durkie avatar Sep 14 '19 17:09 durkie

What version of DBD::Pg do you have? I would expect that this has not been well-tested with Unicode column names, so there are probably some rough edges.

machack666 avatar Sep 14 '19 21:09 machack666

It is 3.10.0 -- but to be clear, there are no column names with unusual characters: just one column with a default value of a bicycle emoji and others containing all kinds of user text.

durkie avatar Sep 14 '19 23:09 durkie