ReadConsoleOutputCharacterW behavior change in 1.22
Windows Terminal version
1.22.10731.0
Windows build number
10.0.26100.3476
Other Software
Here is sample code that writes a line of text to the console using WriteConsoleW, reads the line back using ReadConsoleOutputCharacterW, and then compares the content.
example.cxx
#include <cstring>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
#include <wchar.h>
#include <windows.h>

int main()
{
    std::wstring const text =
        L"\u092F\u0942\u0928\u093F\u0915\u094B\u0921 " // Hindi
        L"\u03B5\u03AF\u03BD \u03B1\u03B9 "             // Greek
        L"\u0437\u0434\u043E\u0440\u043E\u0432\u043E!"  // Russian
        ;

    // Write a line of text to the console.
    HANDLE hOut = GetStdHandle(STD_OUTPUT_HANDLE);
    WriteConsoleW(hOut, text.data(), static_cast<DWORD>(text.size()), nullptr, nullptr);
    WriteConsoleW(hOut, L"\n", 1, nullptr, nullptr);

    // Read the line of text back from the console.
    std::vector<wchar_t> received;
    {
        CONSOLE_SCREEN_BUFFER_INFO screenBufferInfo;
        if (!GetConsoleScreenBufferInfo(hOut, &screenBufferInfo)) {
            std::cerr << "GetConsoleScreenBufferInfo failed\n";
            return 1;
        }
        DWORD width = screenBufferInfo.dwSize.X;
        received.resize(width);
        // The text was written on the row just above the current cursor row.
        COORD coord{ 0, static_cast<SHORT>(screenBufferInfo.dwCursorPosition.Y - 1) };
        DWORD charsRead = 0;
        if (!ReadConsoleOutputCharacterW(hOut, received.data(), width, coord,
                                         &charsRead) ||
            charsRead == 0) {
            std::cerr << "ReadConsoleOutputCharacterW failed\n";
            return 1;
        }
    }

    // Compare the line we read to the line we wrote.
    if (std::memcmp(received.data(), text.data(),
                    text.size() * sizeof(wchar_t)) == 0) {
        std::cerr << "Console has expected content" << std::endl;
    } else {
        std::cerr << "Expected output | Received output" << std::endl;
        for (size_t i = 0; i < text.size(); i++) {
            std::cerr << std::setbase(16) << std::setfill('0') << " "
                      << "0x" << std::setw(8) << static_cast<unsigned int>(text[i])
                      << " | "
                      << "0x" << std::setw(8)
                      << static_cast<unsigned int>(received[i]);
            if (static_cast<unsigned int>(text[i]) !=
                static_cast<unsigned int>(received[i])) {
                std::cerr << " MISMATCH!";
            }
            std::cerr << std::endl;
        }
        std::cerr << std::endl;
        return 1;
    }
    return 0;
}
Steps to reproduce
Compile the above example.cxx sample code and run it in a Windows Terminal.
>cl -EHsc example.cxx
>example
Expected Behavior
ReadConsoleOutputCharacterW recovers what WriteConsoleW wrote, as it did in Windows Terminal 1.21 and always has in Windows Console Host:
>example
यूनिकोड είν αι здорово!
Console has expected content
Actual Behavior
ReadConsoleOutputCharacterW returns text in which some characters are replaced by U+FFFD replacement characters, so the remaining characters no longer line up with what was written.
>example
यूनिकोड είν αι здорово!
Expected output | Received output
0x0000092f | 0x0000fffd MISMATCH!
0x00000942 | 0x0000fffd MISMATCH!
0x00000928 | 0x0000fffd MISMATCH!
0x0000093f | 0x00000921 MISMATCH!
0x00000915 | 0x00000020 MISMATCH!
0x0000094b | 0x000003b5 MISMATCH!
0x00000921 | 0x000003af MISMATCH!
0x00000020 | 0x000003bd MISMATCH!
0x000003b5 | 0x00000020 MISMATCH!
0x000003af | 0x000003b1 MISMATCH!
0x000003bd | 0x000003b9 MISMATCH!
0x00000020 | 0x00000020
...
I also built Windows Terminal from source and ran git bisect. The behavior change was introduced by #16916.
Thanks so much for the comprehensive repro.
This is one of those thorny issues where moving the platform forward comes at the cost of some backwards compatibility.
With the release of 1.22 and the switch to using grapheme clusters by default, combining characters (or grapheme bases which require additional characters, or... there are a lot of cases here) like U+093F, U+0942, and U+094B can no longer be inserted into individual cells (or CHAR_INFO) during streaming text output.
This is one of the ways in which the Windows Console APIs were never sufficient for use with languages other than those which use the Latin alphabet and some limited CJK.
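To make the cell-versus-code-unit mismatch concrete, here is a rough sketch that models the readback seen in the mismatch table from the report. It rests on two assumptions that are not taken from the Terminal source: `isDevanagariMark` is a hypothetical helper that only knows the Devanagari block (real segmentation follows the full UAX #29 rules), and the "one wchar_t per cell, U+FFFD for a cell holding more than one code unit" rule is inferred purely from the output shown in the report.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true for Devanagari combining marks (vowel signs,
// virama, nukta, ...). Covers only U+0900..U+097F; not a general solution.
bool isDevanagariMark(wchar_t c)
{
    return (c >= 0x0900 && c <= 0x0903) || (c >= 0x093A && c <= 0x093C) ||
           (c >= 0x093E && c <= 0x094F) || (c >= 0x0951 && c <= 0x0957) ||
           (c >= 0x0962 && c <= 0x0963);
}

// Toy model of cell assignment: each base character starts a new cell and
// any marks that follow it share that cell.
std::vector<std::wstring> splitIntoCells(const std::wstring& text)
{
    std::vector<std::wstring> cells;
    for (wchar_t c : text) {
        if (!cells.empty() && isDevanagariMark(c)) {
            cells.back() += c;
        } else {
            cells.push_back(std::wstring(1, c));
        }
    }
    return cells;
}

// Assumption inferred from the report's output: reading back yields one
// wchar_t per cell, and a cell with more than one code unit yields U+FFFD.
std::wstring modelReadback(const std::wstring& text)
{
    std::wstring out;
    for (const std::wstring& cell : splitIntoCells(text)) {
        out += cell.size() == 1 ? cell[0] : L'\uFFFD';
    }
    return out;
}
```

Under this model, the seven code units of the Hindi word collapse into four cells, three of which read back as U+FFFD, and everything after them shifts left. That is the same pattern as the first rows of the mismatch table.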
If you have an application that requires strict compatibility with the original one-narrow-character-per-cell measurements offered by the console, you can configure the measurement mode Terminal uses in the Compatibility settings.
> you can configure the measurement mode Terminal uses in the Compatibility settings.
Is there some capability that applications can use to detect this (e.g., to have an accurate wcwidth implementation)?
Yes, absolutely! If you set console mode ENABLE_VIRTUAL_TERMINAL_PROCESSING and emit a request for DEC private mode 2027 (DECRQM 2027 "grapheme cluster support"):
\e [ ? 2 0 2 7 $ p
you will get a VT-encoded response (DECRPM) indicating whether it is permanently set / enabled (3) or permanently reset / disabled (4).
The full exchange looks something like this:
TERMINAL        ||   APPLICATION
                <-   \e[?2027$p
\e[?2027;3$y    ->
If you get a reply indicating 4 or you do not get a reply, the console is in traditional/Windows measurement mode. If you get a reply indicating 3, the console is in grapheme cluster measurement mode.
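A minimal sketch of the decision logic described above, assuming C++17. It only builds the request string and interprets a reply that has already been read. The console I/O itself (enabling ENABLE_VIRTUAL_TERMINAL_PROCESSING, writing the request, reading the reply with a timeout) is deliberately left out. The names `buildDecrqm`, `parseDecrpm`, `classifyReply`, and `MeasurementMode` are made up for illustration, and treating any reply value other than 3 as "traditional" is an assumption that goes slightly beyond what the thread states.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Builds the DECRQM request for a DEC private mode, e.g. ESC [ ? 2027 $ p.
std::string buildDecrqm(int mode)
{
    return "\x1b[?" + std::to_string(mode) + "$p";
}

// Parses a DECRPM reply of the form ESC [ ? <mode> ; <value> $ y and returns
// <value>. Returns std::nullopt if the reply is malformed or names another mode.
std::optional<int> parseDecrpm(std::string_view reply, int mode)
{
    const std::string prefix = "\x1b[?" + std::to_string(mode) + ";";
    const std::string_view suffix = "$y";
    if (reply.size() <= prefix.size() + suffix.size()) return std::nullopt;
    if (reply.substr(0, prefix.size()) != prefix) return std::nullopt;
    if (reply.substr(reply.size() - suffix.size()) != suffix) return std::nullopt;

    const std::string_view digits =
        reply.substr(prefix.size(), reply.size() - prefix.size() - suffix.size());
    if (digits.size() > 3) return std::nullopt; // keep the value small and sane
    int value = 0;
    for (char c : digits) {
        if (c < '0' || c > '9') return std::nullopt;
        value = value * 10 + (c - '0');
    }
    return value;
}

enum class MeasurementMode { Traditional, GraphemeClusters };

// 3 (permanently set) means grapheme cluster measurement. A 4, an unparsable
// reply, or an empty string (standing in for "no reply") means traditional.
MeasurementMode classifyReply(std::string_view reply)
{
    const std::optional<int> value = parseDecrpm(reply, 2027);
    return (value && *value == 3) ? MeasurementMode::GraphemeClusters
                                  : MeasurementMode::Traditional;
}
```

A caller would write `buildDecrqm(2027)` to the console, wait briefly for input, and pass whatever arrived (or an empty string on timeout) to `classifyReply`.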
The "Unix/wcswidth" measurement mode is somewhat of an outlier here, and I don't have a good answer for how to detect it. It's not the default or the backwards-compatible option so we expect users to only use it when they have an explicit need.
We don't have another mechanism for signaling console state that is both extensible and backwards-compatible, so right now VT is the best we can offer. Sorry about that.
@DHowett thanks for the explanation! I mainly wanted to make sure this was not unintended, and if the change is intentional then I'm fine with closing this.
In my real use case, our application only writes to the console and so is not affected by this change. Our use of ReadConsoleOutputCharacterW is only in a test case that verifies that we wrote to the console correctly. Is there some other way we can read back from the console now?
I've updated our test suite to avoid reading back from the console, so the behavior change in ReadConsoleOutputCharacterW no longer affects us.
Since the change was known and intentional, I'll close this issue.
Sorry about the lack of response or better options here - it has been a much busier April and May than we expected. I think it's the right call to avoid reading the console back in tests. We have tests that do that, but they're expressly tests of the console subsystem. 🙂