msys2-runtime
Multi-byte characters not rendered correctly when output from a C or C++ program
If a C or C++ program writes multi-byte characters to the console, they are not rendered correctly. The following shell script reproduces the problem.
#! /usr/bin/env sh
pacman -S --needed mingw-w64-ucrt-x86_64-gcc mingw-w64-ucrt-x86_64-python
printf '#include <stdio.h>\nint main(void) { puts("∈√≈≡⊥"); }\n' >msys2.c
gcc msys2.c
./a
echo $(./a)
printf '#include <cstdio>\nint main(void) { std::puts("∈√≈≡⊥"); }\n' >msys2.cc
g++ msys2.cc
./a
echo $(./a)
printf 'import sys\n\nprint("∈√≈≡⊥")' >msys2.py
python msys2.py
echo "∈√≈≡⊥"
Here's the output.
warning: mingw-w64-ucrt-x86_64-gcc-14.1.0-3 is up to date -- skipping
warning: mingw-w64-ucrt-x86_64-python-3.11.9-1 is up to date -- skipping
there is nothing to do
ΓêêΓêÜΓëêΓëÃ⊥
∈√≈≡⊥
ΓêêΓêÜΓëêΓëÃ⊥
∈√≈≡⊥
∈√≈≡⊥
∈√≈≡⊥
Key Observations
- When multi-byte characters are written by a C or C++ program, the characters actually displayed don't appear to be related to the originals, and can themselves be multi-byte.
- If the output is saved to a variable and then `echo`ed (see `echo $(./a)` above), the characters are displayed correctly.
- Upon writing multi-byte characters from Python or sh, nothing unexpected occurs.
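One possible explanation, which is an assumption on my part and not something confirmed here, is that the UTF-8 bytes written by the C and C++ programs are being interpreted in a legacy console code page such as CP437. The minimal Python sketch below re-decodes the UTF-8 bytes of the test string that way; `misdecode` is a hypothetical helper introduced only for this illustration. Its result starts with the same characters as the garbled lines in the output above.

```python
# Hypothetical helper (not part of MSYS2 or any library): shows how UTF-8
# bytes look when they are decoded with a legacy single-byte code page.
def misdecode(text: str, codepage: str = "cp437") -> str:
    """Encode text as UTF-8, then decode those bytes as if they were in codepage."""
    return text.encode("utf-8").decode(codepage)


if __name__ == "__main__":
    original = "∈√≈≡⊥"
    garbled = misdecode(original)
    print(garbled)
    # Each character here is 3 bytes in UTF-8, so it turns into
    # three separate code page characters.
    print(len(original), len(garbled))
```

If this assumption holds, it would also be consistent with `echo $(./a)` displaying correctly, since in that case the shell rather than the compiled program writes the bytes to the console.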
I am using 64-bit MSYS2 20230526. I didn't try this with the latest version because I didn't find any bug reports for this issue even after searching for a while.