node-informix icon indicating copy to clipboard operation
node-informix copied to clipboard

Locale support

Open leosurban opened this issue 8 years ago • 22 comments

Hi, I have trouble with our locales. We are using czech locale for Central Europe, CP1250 on Windows, with PC-LATIN-2 on DOS command window.

So my settings is: SET DB_LOCALE=cs_CZ.cp1250 SET CLIENT_LOCALE=cs_CZ.pc-latin-2

This leads to correct result:

echo SELECT nazevmater FROM cenmat | dbaccess mydb All characters are correctly displayed, including diactritic, see screenshot.

Same setting, same window:

node ixtest.js

var opts = {
  database : 'mydb@ixlur',
  username : 'informix',
  password : '***'
}; 
var informix = require( 'informix' )( opts );
informix
  .query( "select nazevmater from cenmat" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();
    
  } )
  .then( function ( results )
  {
     console.log( 'results:', results );
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

It is working but diacritic characters are missing (? is displayed, see screenshot).

My database is encoded as CP1250, so DB_LOCALE is correct. In our Windows application with CP1250 is all ok, in DBACCESS, with pc-latin-2, too.

With node-informix I tried CLIENT_LOCALE set to cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.UTF8, result same - ? characters.

But your application (ESQL) read locales, because when I set wrong locale: SET CLIENT_LOCALE=cs_CZ.bad then error is displayed:

node ixtest.js Error: [-23101] Unable to load locale categories. at Error (native)

python -V Python 2.7.13

node -v v6.7.0

npm list informix node-api@ C:\buffer\jsrest `-- [email protected]

node-informix-locales1 node-informix-locales2

leosurban avatar Feb 17 '17 08:02 leosurban

@leosurban, can you give a sample SQL script to create a DB and a table with some data for me to test this out (similar to test/support/test-db.sql)?

Most likely this is due to ASCII/UTF-8 conversion issues 😃.

uatuko avatar Feb 18 '17 02:02 uatuko

Command script for Windows (for dbimport, dbexport, dbaccess, ...) ixlur.cmd: Important for LOCALE is DB_LOCALE and CLIENT_LOCALE. Last command is setting console locale for switch window to PC-LATIN-2 (=code page 852).

set INFORMIXDIR=C:\PROGRA~1\IBMINF~1
set REGMACHINE=\\LUR-PC
set INFORMIXSERVER=ixlur
set ONCONFIG=onconfig.ixlur
set INFORMIXSQLHOSTS=C:\Program Files\IBM Informix Software Bundle\etc\sqlhosts.ixlur
set GL_USEGLU=1
set PATH=%INFORMIXDIR%\bin;%PATH%
set DB_LOCALE=cs_CZ.cp1250
set CLIENT_LOCALE=cs_CZ.pc-latin-2
set DBDATE=dmy4.
set DBMONEY=.Kc
mode con codepage select=852

LocaleTest.exp.zip

unzip LocaleTest.zip

dbimport LocaleTest

echo select somename from exampletable | dbaccess LocaleTest

Database selected.
somename  Pink earthworm
somename  Růžová žížala
somename  Beehive
somename  Včelí úl
4 row(s) retrieved.
Database closed.

To be sure I attached screenshot with correct result: locale_1

node ixtest.js

node ixtest.js

[ [ 1, 'Pink earthworm', 42783 ],
  [ 2, 'R??ov? ???ala', 42783 ],
  [ 3, 'Beehive', 42783 ],
  [ 4, 'V?el? ?l', 42783 ] ]

// note: I replaced invalid characters with ?, see screenshot.

locale_2

Javascript Node/Code:

var opts = {
  database : 'LocaleTest@ixlur',
  username : 'informix',
  password : '***'
};
var informix = require( 'informix' )( opts );
informix
  .query( "select id, somename, somedate from exampletable" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();    
  } )
  .then( function ( results )
  {
   console.log(results);
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

If you will have some problem to simulate it (I do not know if you are using Windows, if your computer has support for EastEurope codesets etc..) I can prepare IX server with this database, with communication port opened to internet. I can also prepare similar setting for client on Unix/Linux OS (ISO-LATIN-2 code set).

Thanks.

leosurban avatar Feb 18 '17 08:02 leosurban

And ODT file (Zipped) because with Unicode encoded contetent may be better way than UNL single-codeset file. See end of document for INSERT statements.

LocaleTest.odt.zip .

leosurban avatar Feb 18 '17 09:02 leosurban

It seems to work OK with UTF-8.

Using dbaccess (data from LocaleTest.odt):

create table exampletable
(
  id integer not null,
 somename nvarchar(100) not null,
  somedate date not null,
  primary key(id)
  somename nvarchar(100) not null,
  somedate date not null,
  primary key(id)
> );

Table created.

insert into exampletable values(1,"Pink earthworm", today);

1 row(s) inserted.

insert into exampletable values(2,"Růžová žížala", today);

1 row(s) inserted.

insert into exampletable values(3,"Beehive", today);

1 row(s) inserted.

insert into exampletable values(4,"Včelí úl", today);

1 row(s) inserted.

> select * from exampletable;



id        1
somename  Pink earthworm
somedate  02/18/2017

id        2
somename  Růžová žížala
somedate  02/18/2017

id        3
somename  Beehive
somedate  02/18/2017

id        4
somename  Včelí úl
somedate  02/18/2017

4 row(s) retrieved.

> 

Using Node.js:

informix                                                                                                                                                                                                                                                                                                                      
    .query( 'select * from exampletable' )                                                                                                                                                                                                                                                                                    
    .then( ( cursor ) => {                                                                                                                                                                                                                                                                                                    
        return cursor.fetchAll( { close : true } );                                                                                                                                                                                                                                                                           
    } )                                                                                                                                                                                                                                                                                                                       
    .then( ( results ) => {                                                                                                                                                                                                                                                                                                   
        console.log( results );                                                                                                                                                                                                                                                                                               
    } )                                                                                                                                                                                                                                                                                                                       
    .catch( ( err ) => {                                                                                                                                                                                                                                                                                                      
        console.log( err );                                                                                                                                                                                                                                                                                                   
    } );   
$ node tmp/test.js 
[ [ 1, 'Pink earthworm', '2017-02-18T00:00:00.000Z' ],
  [ 2, 'Růžová žížala', '2017-02-18T00:00:00.000Z' ],
  [ 3, 'Beehive', '2017-02-18T00:00:00.000Z' ],
  [ 4, 'Včelí úl', '2017-02-18T00:00:00.000Z' ] ]

Here's a UTF-8 data unload for reference: exampletable.txt

uatuko avatar Feb 18 '17 12:02 uatuko

@leosurban, can you set the windows command prompt and CLIENT_LOCAL to utf-8 and check whether it resolves the issue?

Although ESQL/C support CLIENT_LOCAL when retrieving data, within Node.js all strings are treated as utf-8 so CLIENT_LOCAL should be set to utf-8. But there's no conversion from utf-8 to PC-LATIN-2 which I believe is why you are seeing encoding issues on command prompt.

Either setting command prompt to utf-8 or using a conversion library for display purposes should resolve your issue.

uatuko avatar Feb 18 '17 13:02 uatuko

Sorry, no, it is not working.

mode con codepage select=65001 set CLIENT_LOCALE=cs_CZ.UTF8 echo select somename from exampletable | dbaccess LocaleTest … correct result …

node ixtest.js … bad result …

What is setting of your Linux (?) terminal when you receive this (correct) result?

leosurban avatar Feb 18 '17 13:02 leosurban

locale_3 locale_4

leosurban avatar Feb 18 '17 13:02 leosurban

I'm using Linux UTF-8.

A quick search seem to suggest you'll need Lucida Console font as well for UTF-8 support on windows command prompt which I assume you are using?

On 18 Feb 2017, at 13:37, leosurban [email protected] wrote:

Sorry, no, it is not working.

mode con codepage select=65001 set CLIENT_LOCALE=cs_CZ.UTF8 echo select somename from exampletable | dbaccess LocaleTest … correct result …

node ixtest.js … bad result …

What is setting of your Linux (?) terminal when you receive this (correct) result?

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub, or mute the thread.

uatuko avatar Feb 18 '17 14:02 uatuko

With Lucida Console font is the result the same. And because DBACCESS result with UTF-8 setting is correct (see screenshot), I mean that UTF8 is displayed correctly. locale_5

I tried also your exampletable.txt - correct. locale_6

Only result from ESQLC/JS is wrong :(

leosurban avatar Feb 18 '17 17:02 leosurban

Looks like there's a conversion issue somewhere on windows. Don't have access to a windows PC now to debug this but will try to take a look later this week.

uatuko avatar Feb 18 '17 18:02 uatuko

Probably. I created small test with important result.

var opts = {
  database : 'LocaleTest@ixlur',
  username : 'informix',
  password : '***'
}; 
var informix = require( 'informix' )( opts );
var moment = require('moment');
{
  //build string "Ruzova zizala" with diacritic with unicode characters
  console.log("Static string with diactritic:");
  var str = "R"+String.fromCharCode(367)+String.fromCharCode(382)+"ov"+String.fromCharCode(225)
           +" "
           +String.fromCharCode(382)+String.fromCharCode(237)+String.fromCharCode(382)+"ala"; 
  for (var i=0; i<str.length; i++)
  {
    console.log( str[i] + '...('+ str.charCodeAt(i) + ')' );
  }
}
// now retrieve tha same string from ESQLC
informix
  .query( "select somename from exampletable where id=2" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();    
  } )
  .then( function ( results )
  {
    // display this "static" string...
    console.log("The same string from SQL result:");
    var str = results[0][0];
    for (var i=0; i<str.length; i++)
    {
      console.log( str[i] + '...('+ str.charCodeAt(i) + ')' );
    }
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

Result: locale_7

What is important:

  1. Unicode string is correctly
  2. that CLIENT_LOCALE does not matter. All variants (cs_CZ.UTF-8, cs_CZ.cp1250 and cs_CZ.pc-latin-2) return the same values, for diacritic characters return value 65533.

Thank you. L.

leosurban avatar Feb 18 '17 19:02 leosurban

Tested this on a Windows 7 VM and encoding works fine.

Node.js command prompt:

ConEmu shell:

However my DB is in en_us.819 (default?).

> select * from systables where tabid = 90;



tabname           GL_COLLATE
owner            
partnum          0
tabid            90
rowsize          0
ncols            0
nindexes         0
nrows            0.00
created          02/18/2017
version          0
tabtype          
locklevel        
npused           0.00
fextsize         0
nextsize         0
flags            0
site             en_US.819
dbname           
type_xid         0
am_id            0
pagesize         0
ustlowts         
secpolicyid      0
protgranularity  
statchange       0
statlevel        

1 row(s) retrieved.

uatuko avatar Feb 18 '17 22:02 uatuko

So need to check,

  • [x] Is this an issue with DB_LOCALE being cs_CZ.cp1250?
    • cs_CZ.UTF8 seems to work, most likely this issue is related to some additional features presented in cs_CZ.cp1250 (see #issuecomment-280939778).
  • [ ] ~~Is this an issue with Windows 8/10~~?
  • [x] Can we reproduce this in Linux/Mac?

uatuko avatar Feb 18 '17 22:02 uatuko

However my DB is in en_us.819 (default?).

This is wrong and probably reason, why it is working for you. You must have some czech locale on database side to have correct collation (ordering, lower/uppercase converting etc). I can be any cs_CZ locale ( cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.8859-2, probable cs_CZ.UTF8 too). Your locale en_US.819 is 8859-1, standard ASCII.

This will not work in your database: locale_8

You need set DB_LOCALE before you import/create database. It will not change nothing in your current databases. Only this one will have this settings.

I am sure, that this is not problem with "czech" locale and I can prepare the same example for russian locales and CP1251. I used czech locale because I need it and for you is better to test because there is mix of diacritic and "normal" characters.

echo select * from systables where tabid = 90 | dbaccess LocaleTest tabname GL_COLLATE owner partnum 0 tabid 90 rowsize 0 ncols 0 nindexes 0 nrows 0.00 created 18.02.2017 version 0 tabtype locklevel npused 0.00 fextsize 0 nextsize 0 flags 0 site cs_CZ.1250 dbname type_xid 0 am_id 0 pagesize 0 ustlowts secpolicyid 0 protgranularity statchange 0 statlevel 1 row(s) retrieved.

leosurban avatar Feb 19 '17 06:02 leosurban

Took a little bit of time to update my Informix images to include extended GLS support but finally managed to reproduce this issue on Linux 😃.

  1. DB_LOCALE=cs_CZ.UTF8
> create database gls with buffered log;

Database created.

> select * from systables where tabid = 90;



tabname           GL_COLLATE
owner            
partnum          0
tabid            90
rowsize          0
ncols            0
nindexes         0
nrows            0.00
created          19/02/17
version          0
tabtype          
locklevel        
npused           0.00
fextsize         0
nextsize         0
flags            0
site             cs_CZ.57372
dbname           
type_xid         0
am_id            0
pagesize         0
ustlowts         
secpolicyid      0
protgranularity  
statchange       0
statlevel        

1 row(s) retrieved.

If I don't set CLIENT_LOCALE, both dbaccess and Node.js will display invalid characters but setting CLIENT_LOCALE=en_gb.utf8 (or to cs_CZ.UTF8) will solve both dbaccess and Node.js display issues.

dbaccess

node.js

uatuko avatar Feb 19 '17 18:02 uatuko

I can't get cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.8859-2 locales to work. They throw -23101/-23104 errors on my 12.10 FC8DE installation (client SDK 4.10 FC8DE) even through I selected all available GLS components during installation.

@leosurban, If you have a DB that I can access I can look into this more (probably will need a client SDK that supports your locales as well) but it does seem to suggest a settings mismatch is the likely culprit.

For reference, you can use the docker images I used (docker pull nukedzn/informix:12.10.FC8DE) to validate my results.

uatuko avatar Feb 19 '17 19:02 uatuko

@uditha-atukorala Please send me mail to **** at kolin.cz to get access to test server. Thanks.

(UA) Edit: Mask email.

leosurban avatar Feb 20 '17 17:02 leosurban

Looking at $INFORMIXDIR/gls/lc11/os, on Linux/Mac there aren't any OS specific files.

$ ls $INFORMIXDIR/gls/lc11/os
C  C.lc  POSIX  POSIX.lc  _readme
$ head -15 $INFORMIXDIR/gls/lc11/os/POSIX.lc 
##########################################################################
#                                                                        #
#  Licensed Materials - Property of IBM                                  #
#                                                                        #
#  "Restricted Materials of IBM"                                         #
#                                                                        #
#  IBM Informix GLS                                                      #
#  (c) Copyright IBM Corporation 1996, 2004 All rights reserved.         #
#                                                                        #
##########################################################################
<operating_system> "portable" 
<source_version>		100
<modified_date>		"05-04-2004"

On Windows there are a bunch of files which seems to be OS specific.

e.g. czech.lc

##########################################################################
#                                                                        #
#  Licensed Materials - Property of IBM                                  #
#                                                                        #
#  "Restricted Materials of IBM"                                         #
#                                                                        #
#  IBM Informix GLS                                                      #
#  (c) Copyright IBM Corporation 1996, 2004 All rights reserved.         #
#                                                                        #
##########################################################################
<operating_system>  "winnt351"
<source_version>                100
<modified_date>         "05-04-2004"

uatuko avatar Feb 26 '17 10:02 uatuko

Hello,

I need to use node-informix for a new project and I'm having issues with locales.

The following configuration works well in PHP:

  1. Database in en_US.819
  2. Client connection with option "DB_LOCALE=en_US.819"
  3. Queries results converted successfully using php function "uft8_encode()"

Using node-informix I can't convert query results to UTF-8 correctly. I have tried:

  1. Buffer.from(value, 'latin1').toString('utf-8');
  2. NPM package "utf8", function "encode"

Characters like the "MASCULINE ORDINAL INDICATOR" and latin accents appear wrong.

Example String: Drtº Here is what it looks like in the console: image

Here here what it looks like in the browser with "meta charset=utf-8": image

Would it be possible to pass the LC_LOCALE param to the connection options?

Any help is appreciated. Thanks.

taviroquai avatar Sep 19 '19 16:09 taviroquai

Hello,

My previous comment looks it is solved for now with the following environment variables: CLIENT_LOCALE="en_US.utf8" DB_LOCALE="en_US.819"

taviroquai avatar Sep 23 '19 11:09 taviroquai

Hi, did you ever solve this cs_CZ.cp1250 locale? I have exactly the same problem like @leosurban and can't find any solution. Thanks

ctahk0 avatar Sep 24 '19 12:09 ctahk0

Unfortunately not, I didn't have a client SDK with the necessary locale to test this fully.

If at all possible, using a UTF8 variant seem to solve the issue (#issuecomment-280939778).

uatuko avatar Sep 24 '19 12:09 uatuko