MINGW-packages icon indicating copy to clipboard operation
MINGW-packages copied to clipboard

unixodbc: get locale charset

Open mmuetzel opened this issue 1 year ago • 14 comments

Query the locale charset (also on Windows) instead of hardcoding the used encoding to GB18030 (see #12892).

The patch is heavily "inspired" by the localcharset module of gnulib.

Edit: For a motivation for these changes, please see the comments here: https://github.com/msys2/MINGW-packages/pull/12892#discussion_r961312024

mmuetzel avatar Sep 01 '22 15:09 mmuetzel

@taozuhong: Were you able to test with the build artifacts from this PR? You can download them from the drop-down list that appears when you click on "Artifacts" in the upper right corner of the "Checks" tab of this PR. Download the respective artifact for your environment, unzip it, and install it with pacman. E.g., for the MINGW64 build artifact, you could install it with pacman -U mingw-w64-x86_64-unixodbc-2.3.11-3-any.pkg.tar.zst.

mmuetzel avatar Sep 05 '22 14:09 mmuetzel

image

image

image

taozuhong avatar Sep 06 '22 01:09 taozuhong

Are these three screenshots showing output before or after the build artifact from this PR was installed? (I can't spot the difference between the output in the different screenshots.) Is that the expected result? If it isn't, which result did you expect?

mmuetzel avatar Sep 06 '22 06:09 mmuetzel

The output is produced by the new version (download from your link), and the main difference is using different ODBC driver paths.

It passed the compilation and linked to the right library, but can't enumerate all drivers that list in the odbcinst.ini.

taozuhong avatar Sep 07 '22 01:09 taozuhong

Is this different to before the change from this PR?

mmuetzel avatar Sep 07 '22 03:09 mmuetzel

I don't know the difference of them, because I didn't test it formerly.

taozuhong avatar Sep 07 '22 08:09 taozuhong

I'm confused. So, which issue did you fix with #12892? And does it re-appear with this change?

mmuetzel avatar Sep 07 '22 10:09 mmuetzel

  1. My app link with Windows built-in ODBC component, found encoding issue.
  2. Then search solution on the internet to fix it, found UnixODBC has the same kind issue
  3. So I submitted a PR for UnixODBC to fix it.
  4. The solution to fix the issue that I met, add CHARSET=utf8mb4; to the mysql connection string.

taozuhong avatar Sep 07 '22 13:09 taozuhong

Unfortunately, I can't help in setting up unixodbc either.

As maybe the next best thing, could you please check if this stand-alone program determines the correct charset for you? test_locale.zip

Please, un-zip the source file and compile and test it with:

gcc test_locale.c -otest_locale.exe
./test_locale.exe

Could you please show the output of these commands in a MINGW64 shell?

If it determines the correct charset, we can probably assume that the changes in this PR would do the correct thing for you, too.

mmuetzel avatar Sep 07 '22 15:09 mmuetzel

@Biswa96: Do you know how to test this package?

mmuetzel avatar Sep 07 '22 15:09 mmuetzel

I am not familiar with the project. Others may help.

Biswa96 avatar Sep 07 '22 15:09 Biswa96

$ ./test_locale.exe
current_locale = C
pdot = (null)
buf = CP936
codeset (before lookup table) = CP936
codeset (after lookup table) = GBK

taozuhong avatar Sep 08 '22 00:09 taozuhong

Thank you. Well, that's not exactly the charset that you hardcoded in the other PR. But it is an Asian charset pretty similar to it afaict.

I guess that means that the locale detection would be working with this change. I'll try to propose it upstream.

In any case, it would be better to have this instead of the currently hardcoded charset imho. So, I hope this can be merged here.

@Biswa96 or anyone that can help: Do you know how to reach out to users of this package?

Edit: IIUC, the "real" fix for this is for downstream users to define their encoding in the connection string. In that case, reverting the other PR might also be a (inferior) solution. Imho, it would be less surprising for an end user if packages from this repository default to a Western encoding (instead of an Asian encoding). (After all, the available documentation is mainly in English...)

mmuetzel avatar Sep 08 '22 06:09 mmuetzel

The patch was accepted upstream. I updated the PR to get the patch from there.

mmuetzel avatar Sep 08 '22 09:09 mmuetzel