win32 encoding: fix more inconsistencies
While looking into #2196 I was able to find some more inconsistencies in the win32 encoding. I discovered the following differences:
CTRL + ALT + keycombinations should be sent with the same character code that would be generated byAltGr + key, i.e. either\x00or a character depending on the current layout. I fixed the first case for now, which should fix #2196 in most cases.AltGr + keyshould be sent with character code\x00like above if that combination has no associated character. Right now wezterm falls back on the character code ofkey.SHIFT + ascii_lettersends the correct character, but misses the SHIFT flag. This is caused by thenormalize_shiftfunction. I don't know if this is intentional behaviour for non-windows systems, so I didn't want to change that for now.
While we could try to implement all these different cases, I did some testing and realized the ToUnicode function would actually return the correct character in all these cases if the control keys weren't manually removed from the key set before, as it currently happens here: https://github.com/wez/wezterm/blob/d5060ec3c100c01157252381830c4f1d043cd84d/window/src/os/windows/window.rs#L2143-L2153
However, this would change the KeyEvents wezterm generates, possibly breaking existing configurations and keyboard encoding for allow_win32_input_mode = false.
One possible idea would be to add a new optional field to the KeyEvent structure which is only used when using the win32 encoding. What's your opinion on all this?
I'm not opposed to adding something like a win32_uni_char field alongside the scan_code field:
https://github.com/wez/wezterm/blob/d7fa541745c48ae374072715c4439494d6d2d734/wezterm-input-types/src/lib.rs#L1076-L1078
to hold what we get from ToUnicode in this case.
However! ToUnicode has internal global state that makes it tricky to call multiple times with varying parameters: it can destroy or be surprisingly affected by dead key states, so we'd need to be careful to verify that things are working correctly! I think robust results might be to:
- Save a pristine copy of the
keyswe get from theGetKeyboardStatecall - Later in that function, when we understand the dead key state, call
ToUnicodewith the pristine state to get the data we need. That may require following up with additional call(s) to replay/correct the dead key state afterwards. You can see some of the gymnastics that happens to probe the layout in theKeyboardLayoutInfocode. - We might be able to avoid the tricky parts: if they are only tricky while we are in the middle of dead key processing, then those key presses probably wouldn't make it to the terminal, so perhaps it is valid to only make this extra
ToUnicodecall in the non-dead-key code path?
The last commits are my shot at implementing this. Some notes:
-
I only call
ToUnicodetwice in the same case when it would have been called once anyway. This should ensureToUnicodestays in the correct state. -
I don't call
ToUnicodewhen handling dead keys, I assume the probing process already identified the correct characters, sinceToUnicodeis called with all modifiers intact when probing. -
There is some kind of race condition happening when two KeyEvents are emitted by an invalid dead key combination. Both keys sometimes arrived in the terminal swapped around.
I didn't look far into it, but I assume this happened because the first of the two keys was encoded with the default encoding and sent with
pane.key_downwhile the second key was encoded with the win32 encoding and sent usingpane.writer.write_all.My changes now let both keys be encoded with win32 encoding, this seems to have fixed the problem. But I thought I'd let you know in case it happens again.
All in all, these changes make the win32 encoding a lot more accurate. There's still a few differences, mainly regarding dead key handling, but I'm not sure fixing those is worth the effort.
Thanks for your patience with this; I manually merged this, except for one dead key case that I recall as being important for the non-win32 input mode. We can follow up on that in a separate PR if that still causes problems!