Py_Initialize fails with Fatal Python error: config_get_locale_encoding: failed to get the locale encoding: nl_langinfo(CODESET) failed
Bug report
Bug description:
We are calling Py_Initialize and it fails with: Fatal Python error: config_get_locale_encoding: failed to get the locale encoding: nl_langinfo(CODESET) failed
Here is my locale:
% locale
LANG=""
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
CPython versions tested on:
3.8
Operating systems tested on:
macOS Sonoma
export LANG="en_US.UTF-8"
I'm on macOS Sequoia (Homebrew Python 3.12.4), where a Python script is called via a system() command from a C++ program, and I get the nl_langinfo(CODESET) failure.
As my shell already has the correct locale, my workaround is to run my executable as:
LANG="en_US.UTF-8" /Applications/ES-DE.app/Contents/MacOS/ES-DE
I found some possible insight in an Aug 2020 fork of Python in Gentoo:
UTF-8 became the defacto standard and it's safe to make the assumption that the user expects UTF-8. For example, nl_langinfo(CODESET) can return an empty string on macOS if the LC_CTYPE locale is not supported, and UTF-8 is the default encoding on macOS.
I believe @udance4ever's insights are at the heart of this issue on macOS.
TL;DR - on macOS, nl_langinfo(CODESET) can be non-empty in the outer Python process yet empty in a nested Python invocation.
I've noticed the following: this behavior can still be reproduced on macOS Sequoia (e.g., version `15.5 (24F74)`) with the built-in Python (e.g., `Python 3.9.6 (default, Apr 30 2025, 02:07:18) [Clang 17.0.0 (clang-1700.0.13.5)] on darwin`) when calling any pre-3.13 Python executable in a venv, including itself (e.g., `v3.9.6` -> `venv` -> invoke cpython). Interestingly, Python 3.9.6 works when calling things like venv and tox (e.g., no issues with the export LANG="en_US.UTF-8" and export LC_CTYPE="en_US.UTF-8" workarounds), but the issue seems to manifest when Python calls a sub-shell that then tries to invoke another Python executable.
I'm not sure if this is a CPython bug (or just a macOS venv issue), but it may be worth documenting that, at least on darwin systems, care must be taken to ensure developers propagate the locale (LANG and LC_* shell vars) to system calls (or rather the system call's shell), especially those that will invoke another CPython instance.
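For anyone launching a nested interpreter from Python, here is a minimal sketch of what "propagating the locale" could look like. The helper names (`env_with_locale`, `run_child_python`) and the `en_US.UTF-8` fallback are my own assumptions for illustration, taken from the workarounds mentioned in this thread; I have not confirmed that this is sufficient in every setup.

```python
import os
import subprocess
import sys

# Assumed fallback value; en_US.UTF-8 is the one used in the workarounds above.
FALLBACK_LOCALE = "en_US.UTF-8"


def env_with_locale(base=None, fallback=FALLBACK_LOCALE):
    """Return a copy of the environment where LANG and LC_CTYPE are non-empty.

    Hypothetical helper: a missing or empty variable is filled with `fallback`;
    values that are already set are left untouched.
    """
    env = dict(os.environ if base is None else base)
    for name in ("LANG", "LC_CTYPE"):
        if not env.get(name):  # unset or empty string
            env[name] = fallback
    return env


def run_child_python(args, base=None):
    """Start another interpreter with the locale variables passed explicitly."""
    return subprocess.run(
        [sys.executable, *args],
        env=env_with_locale(base),
        capture_output=True,
        text=True,
        check=False,
    )
```

Passing `env=` explicitly means the child does not depend on whatever the intermediate shell happened to inherit.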
The behavior is, however, not present for Python 3.13.5+ when setting the environment variables introduced in Python 3.7, PYTHONUTF8=1 and PYTHONCOERCECLOCALE=UTF-8 (e.g., PYTHONCOERCECLOCALE set to a non-zero value).
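A rough sketch of setting those two variables for a child interpreter and checking that UTF-8 mode is actually on in the child. The helper names (`utf8_mode_env`, `child_utf8_mode`) are hypothetical; `sys.flags.utf8_mode` is the standard flag available since Python 3.7. This only shows how to pass the variables; I have only seen it avoid the failure on 3.13.5+, as noted above.

```python
import os
import subprocess
import sys


def utf8_mode_env(base=None):
    """Return a copy of the environment with the two variables set.

    Hypothetical helper; the values mirror the ones mentioned in this comment.
    """
    env = dict(os.environ if base is None else base)
    env["PYTHONUTF8"] = "1"
    env["PYTHONCOERCECLOCALE"] = "UTF-8"  # any value other than 0 leaves coercion enabled
    return env


def child_utf8_mode(env):
    """Run a child interpreter and return its sys.flags.utf8_mode as text."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()
```

If `child_utf8_mode(utf8_mode_env())` does not report `1`, the variables are probably being dropped somewhere between the parent and the child.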
Hope this helps someone else save the hours I lost to testing this :shrug:
If it helps any, here is how a developer successfully worked around the issue in his macOS app:
https://gitlab.com/es-de/emulationstation-de/-/commit/4c1c269f903f8c4bd5d24ff0df5a9cc7d9e19385