yuniql
treatment of non-ascii characters
It looks like yuniql replaces non-ASCII characters in my script with '�'. For example:

select regexp_replace(asset_name, '[–—]', '-', 'g') as name from xxx

becomes:

select regexp_replace(asset_name, '[��]', '-', 'g') as name from xxx
The same script executed with psql or dbeaver produces correct representation of the characters.
Thoughts?
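For what it's worth, one plausible mechanism for this symptom (an assumption on my part, not a confirmed diagnosis of yuniql's internals) is the script's UTF-8 bytes being read back with a wrong or lossy decoder, which turns each undecodable byte into U+FFFD ('�'). A minimal Python sketch:

```python
# Hypothetical reproduction of the symptom: a UTF-8 script file decoded
# with the wrong codec replaces multi-byte characters with U+FFFD ('�').
raw = "select regexp_replace(asset_name, '[–—]', '-', 'g') as name from xxx"
utf8_bytes = raw.encode("utf-8")  # how the script would be stored on disk

# Correct round trip: both dashes survive.
assert utf8_bytes.decode("utf-8") == raw

# Lossy decode: each dash is 3 bytes in UTF-8, so each becomes three
# replacement characters.
bad = utf8_bytes.decode("ascii", errors="replace")
print(bad)
```

If this is the mechanism, the fix is to read script files with an explicit UTF-8 decoder rather than the platform default.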
Hi @avengerovsky, thanks for reaching out, and you're probably right about this. I will add some tests to cover this, and a fix will likely be in the next release. I'm really hoping to release this week if time permits.
P.S. ICYMI, please Star our repo. Thanks!
@avengerovsky, I tried to reproduce this, and I think that while it prints incorrectly on the console, it reads the content of the script files correctly. Here I have a script file with the script below, and I can see the Chinese characters are well preserved when inserted into the database.
The problem is in the console output; from what I can tell, it has more to do with the console settings than with the code itself.
SELECT '苹果 (Píngguǒ)' AS Apple UNION ALL
SELECT '微软 (Wēiruǎn)' AS Microsoft UNION ALL
SELECT '三星 (Sānxīng)' AS Samsung
GO
CREATE TABLE [dbo].[test_utf16_table](
[textdata] [nvarchar](MAX) NOT NULL
);
GO
INSERT INTO [dbo].[test_utf16_table] VALUES (N'苹果 (Píngguǒ)')
INSERT INTO [dbo].[test_utf16_table] VALUES (N'微软 (Wēiruǎn)')
INSERT INTO [dbo].[test_utf16_table] VALUES (N'三星 (Sānxīng)')
GO
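The console symptom described above is consistent with an output-encoding mismatch rather than a file-reading bug. A small Python sketch of the idea (the code page chosen here is an illustrative assumption, not what any particular terminal uses):

```python
# The in-memory string is intact, but a console whose code page cannot
# represent CJK text renders those characters as replacements ('?').
text = "苹果 (Píngguǒ)"
garbled = text.encode("cp1252", errors="replace").decode("cp1252")
print(garbled)  # ?? (Pínggu?)
```

On .NET the analogous fix would be setting the console's output encoding to UTF-8 before printing; the file contents themselves need no change.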
The log files also capture the text correctly. Can you send me a test script file I can use to reproduce the issue? Thanks. @avengerovsky
Thank you for looking into this issue. Here is what my log looks like:
The original SQL is:
Now, I have found a workaround by using a double-byte numeric representation of these characters (en dash and em dash), and it is working fine.
Still, I think the problem is there...
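For readers hitting the same issue: the escaped-representation workaround boils down to replacing the literal dashes with their code points so the script stays ASCII-only (the exact escape syntax the reporter used isn't shown; in PostgreSQL one option is a Unicode-escape string such as U&'[\2013\2014]'). The code points involved:

```python
# Code points of the two characters the original regex targets.
for ch in "–—":  # en dash, em dash
    print(f"{ch!r} -> U+{ord(ch):04X}")
```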
Thanks, I will try to reproduce this on PostgreSQL; the one I tested so far is on SQL Server. Good to hear you found a workaround and are still convinced to use yuniql :)
Ideally it should work with pg_dump output without the customization you did. We'll do more investigation. Thanks again.