tds_fdw icon indicating copy to clipboard operation
tds_fdw copied to clipboard

ERROR: Unconverted bytes were changed to question marks

Open denishpatel opened this issue 7 years ago • 14 comments

Hello,

When I try to use Postgres 9.6 tds_fdw using UTF-8 character set... it's throwing error for a couple of tables...

test=# select * from test_fdw.labels; NOTICE: DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'test'., Server: WIN2K12-DEV, Process: , Line: 1, Level: 0 NOTICE: DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: WIN2K12-DEV, Process: , Line: 1, Level: 0 ERROR: DB-Library error: DB #: 2403, DB Msg: Some character(s) could not be converted into client's character set. Unconverted bytes were changed to question marks ('?'), OS #: 0, OS Msg: Success, Level: 4

I verified all the things listed in https://github.com/tds-fdw/tds_fdw/issues/1

Can you help ?

denishpatel avatar Dec 10 '16 18:12 denishpatel

Can you post the table definition please? What have you tried from issue #1?

Smtgr14 avatar Dec 29 '16 15:12 Smtgr14

I have also encountered this.

AndrewGrossman avatar Mar 07 '19 18:03 AndrewGrossman

I have also encountered this.

Interestingly, the issue seems to have resolved when I locally built FreeTDS off of the current master, then built tds-fdw again. OSX Mojave, FreeTDS originally from brew.

AndrewGrossman avatar Apr 20 '19 21:04 AndrewGrossman

Interestingly, the issue seems to have resolved when I locally built FreeTDS off of the current master, then built tds-fdw again. OSX Mojave, FreeTDS originally from brew.

What was the FreeTDS version you downloaded from brew?

juliogonzalez avatar Apr 20 '19 23:04 juliogonzalez

I have this issue on CentOS 7 with freetds version 0.95.81 from EPEL and postgresql-10-tds_fdw.x86_64 ver 2.0.0-alpha.3.2.el7 (from repo: https://tds-fdw.github.io/yum/). SQL Server 2008R2 has database with Collation Serbian_Latin_100 which seams to be CP1250. In /etc/freetds.conf I have only: tds version = 7.3 client charset = UTF-8 I created 2 databases in PostgreSQL: First "izp" with Encoding:"UTF8" (SHOW client_encoding shows UNICODE), and Character type and Collation:"en_US.UTF-8". For varchar(250) string "Vojvođanska banka" PostgreSQL with shows "Vojvoрanska banka a.d." (using PgAdmin4) Second "izpmssql" with Encoding:"WIN1250" ( SHOW client_encoding shows WIN1250), and Character type and Collation:"C". For varchar(250) string "Vojvođanska banka" PostgreSQL with shows "VojvoŃ€anska banka a.d.

If I change to client charset = CP1250 both databases throw error #2403

I have also seen that FreeTDS is now version 1.1 (https://fossies.org/diffs/freetds/1.1_vs_1.1.1/ChangeLog-diff.html) and it shows that there was a fix: "Fix some collation encoding detection"

So my question is if versions I use should be able to show data correctly, or should I try to get my hands on newer versions of freetds and/or tds_fdw?

DrLove73 avatar Jun 18 '19 15:06 DrLove73

OK, It turns out everything is good. I used first database with UTF8 encoding and I dropped then recreated all foreign tables:

IMPORT FOREIGN SCHEMA izp
	EXCEPT (racun, tipKomitenta)
	FROM SERVER izpwin2012
	INTO mssql
	OPTIONS (import_default 'true');
IMPORT FOREIGN SCHEMA izp
	LIMIT to (racun, tipKomitenta)
	FROM SERVER izpwin2012
	INTO mssql
	OPTIONS (import_default 'false');
IMPORT FOREIGN SCHEMA sigurnost
	FROM SERVER izpwin2012
	INTO mssql
	OPTIONS (import_default 'true');

It turns out only PgAdmin4 does not see those characters, but Lazarus with ZEOS components shows them properly. So I consider this non-issue.

DrLove73 avatar Jun 19 '19 09:06 DrLove73

Sounds like you got it, can you close this issue? @DrLove73

SudoerWithAnOpinion avatar Jun 19 '19 22:06 SudoerWithAnOpinion

Yes, you can.

DrLove73 avatar Jun 20 '19 05:06 DrLove73

I was wrong, this is only a partialy good. letter "đ" is shown properly, but "č" and "ž" are not shown properly :-( I will check other special letters from Serbian tomorrow.

DrLove73 avatar Jun 20 '19 16:06 DrLove73

Ok, I conducted a thorough testing of every imaginable variant of data component in Lazarus, but I get same data as from PgAdmin4. I got codes from MSSQL using following SQL:

SET TEXTSIZE 0;  
-- Create variables for the character string and for the current   
-- position in the string.  
DECLARE @position int, @string char(8);  
-- Initialize the current position and the string variables.  
SET @position = 1;  
SET @string = 'ŠĐŽČĆšđžčć';  
WHILE @position <= DATALENGTH(@string)
   BEGIN  
   SELECT ASCII(SUBSTRING(@string, @position, 1)) as ASCII, NCHAR(UNICODE(SUBSTRING(@string, @position, 1))) as Letter,   
      UNICODE(SUBSTRING(@string, @position, 1)) as Unicode
   SET @position = @position + 1  
   END;  
GO  

Shown by MSSQL (2008R2) with "Serbian Latin"->FreeTDS (0.95.81 from EPEL)->TDS_FDW (2.0.0-alpha.3.2.el7)->PgAdmin4: Shown = Should be = MSSQL ASCII() = MSSQL Unicode():

Љ=Š=138=352
Р=Đ=208=272
Ћ=Ž=142=381
И=Č=200=268
Ж=Ć=198=262
љ=š=154=353
р=đ=240=273
ћ=ž=158=382
и=č=232=269
ж=ć=230=263

It seams that is FreeTDS problem since if I use (on Windows) dblib.dll (freetds v1.00.11) via Lazarus MSSQLConn, same characters are shown. If I try to use (on Windows) any newer sybdb.dll (FreeTDS 1.2.x) as dblib.dll that MSSQLConn demands, I get error:

The USE database statement failed because the database collation Serbian_Latin_100_CI_AI is not recognized by older client drivers. Try upgrading the client operating system or applying a service update to the database client software, or use a different collation. See SQL Server Books Online for more information on changing collations.

Any help or direction would be helpfull. Is there any way tds_fdw could convert them on the fly or will I have to add code to my app to avoid this veritable horror I have in attempt to move away from old implementation?

DrLove73 avatar Jun 25 '19 15:06 DrLove73

Can you try updating FreeTDS? In my experience FreeTDS is very picky.

SudoerWithAnOpinion avatar Jun 25 '19 16:06 SudoerWithAnOpinion

I updated last part of my previous comment to reflect I used FreeTDS 1.2.x (last build) on Windows (dll from AppVeyor https://ci.appveyor.com/project/FreeTDS/freetds/builds/25496715/job/398k4ka0iesqxc1k/artifacts), I get that "The USE database" error.

I will have to see how to compile rpm for EL7 with latest version to install it on my devel laptop.

DrLove73 avatar Jun 26 '19 06:06 DrLove73

Ok, IT WORKS! I fired up my rpm rebuilding skills and I managed to update EPEL's src.rpm to version 1.1.6 (replaced tar.bz2, and in spec file changed version number and --with-tdsver="7.3" from --with-tdsver="4.2" ). Since there are no rpm's for CentOS 7, I am publishing rpm and src.rpm files in my old repo: http://centos.plnet.rs/mrepo/plnet-centos7-x86_64/RPMS.plnet-compiled/ and http://centos.plnet.rs/mrepo/plnet-centos7-srpms-srpms/RPMS.plnet-compiled/ BE aware that files are NOT YET SIGNED! I have yet to find my old keys and even see if they are still valid. I will do this at later time. Serbian letters are properly shown in PgAdmin4 and Lazarus via ZEOS components.

DrLove73 avatar Jun 26 '19 12:06 DrLove73

Tds_fdw developers: If providing link to binary packages is against some rules, you may remove them, or I will if you say so. I am long-time CentOS contributor (in mailing lists since ~2009) and I am admin of CentOS Facebook group so rpm's are clean of malware, built from unchanged sources.

DrLove73 avatar Jun 26 '19 14:06 DrLove73