Some Unicode characters appear to be recognised incorrectly

Open pfmoore opened this issue 3 years ago • 10 comments

Environment

Windows build number: 10.0.19041.0 Microsoft Windows NT 10.0.19041.0
Windows Terminal version (if applicable): 1.5.10411.0

Any other software? I used UnicodeInput from here to enter Unicode characters, but other utilities for entering Unicode produce the same issue.

Steps to reproduce

1. Open a Windows Terminal window using cmd or PowerShell.
2. Enter the character ﬀ (U+FB00). I did this using Alt-+ to run UnicodeInput, and then entered the code fb00.
3. The character displayed at the prompt is 翿, not ﬀ.

Expected behavior

The ﬀ character gets entered

Actual behavior

The 翿 character is displayed

Analysis

The character code for 翿 is 32767 (U+7FFF), and indeed any character with a code greater than 32767 appears to be replaced with 翿. So it looks like something is clamping the character code to the maximum signed 16-bit value, rather than accepting the full range of 16-bit character codes.
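The pattern described here (every code above 32767 turns into 32767) is what saturation at the signed 16-bit maximum would produce, and it differs from what a bit mask would produce. The Python sketch below contrasts the two. `clamp_to_int16_max` and `mask_to_15_bits` are hypothetical helpers written only for illustration; nothing here is taken from Windows Terminal's source, and the actual cause is not confirmed in this thread.

```python
INT16_MAX = 0x7FFF  # 32767, the code point of 翿


def clamp_to_int16_max(code: int) -> int:
    """Saturate: any value above 32767 becomes 32767."""
    return min(code, INT16_MAX)


def mask_to_15_bits(code: int) -> int:
    """Mask: keep only the low 15 bits of the value."""
    return code & INT16_MAX


ligature = 0xFB00  # ﬀ, the character from the report

clamped = clamp_to_int16_max(ligature)
masked = mask_to_15_bits(ligature)
print(f"clamped: U+{clamped:04X}")
print(f"masked:  U+{masked:04X}")
```

Clamping maps U+FB00 to U+7FFF (翿), which matches the symptom in the report. Masking would give U+7B00 instead, so a plain mask does not explain why every affected character ends up as the same glyph.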

Other applications, including Notepad and the basic command prompt (outside of Windows Terminal), don't do this, so it looks like a Windows Terminal-specific issue.

Using cut and paste to enter the ﬀ character works fine; it's only when a character is entered "directly" that there is an issue.

pfmoore • Feb 28 '21 14:02

> and the basic command prompt (outside of Windows Terminal) don't do this, so it looks like a Windows Terminal specific issue

Huh. Well that sure is weird. I dunno why I even assigned Dustin to this one, it straight up sounds like a Terminal/ConPTY issue. Sorry for letting this one linger so long!

zadjii-msft • Dec 13 '21 18:12

I use the standard hex numpad input method. It works correctly in conhost, but not in Terminal. This input method is disabled by default, but it can be enabled by setting "EnableHexNumpad" (REG_SZ) to "1" in "HKCU\Control Panel\Input Method". Then log off to start a new session.

With this feature, enter a 16-bit character code as hexadecimal digits as follows: press and hold the left alt key; press and release "+" on the numeric keypad; enter up to 4 hexadecimal digits; and then release the left alt key. You don't have to use the numeric keypad for decimal digits, but you can if you want. An application sees a sequence of WM_SYSKEYDOWN, WM_SYSKEYUP, and WM_KEYUP messages, followed by a WM_CHAR translated message that has the 16-bit character code. Entering non-BMP characters is tedious, but possible. Just enter each code in the surrogate pair. For example, enter U+1F609 as the pair U+D83D and U+DE09. 
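For anyone who needs to work out the pair for a non-BMP character, the standard UTF-16 split can be sketched in Python as follows. `to_surrogate_pair` is a hypothetical helper name chosen for this example; the arithmetic is the usual UTF-16 encoding rule.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a non-BMP code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000       # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits -> low surrogate
    return high, low


high, low = to_surrogate_pair(0x1F609)
print(f"U+{high:04X} U+{low:04X}")  # prints "U+D83D U+DE09"
```

The two printed values are the codes to type, one after the other, with the hex numpad method.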

The problems with this input method in Terminal are that numlock has to be enabled; decimal digits have to be entered on the numeric keypad; and Terminal adds the "+" and non-decimal digits to the input buffer. For example, when entering U+123A, the result is "+aሺ".
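The stray characters can be summarised with a toy model in Python. This is only a restatement of the symptom as described above, assuming the "+" and every non-decimal hex digit leak into the buffer ahead of the composed character. `observed_terminal_input` is a hypothetical name and does not reflect how Terminal processes keys internally.

```python
def observed_terminal_input(hex_digits: str) -> str:
    """Toy model of the reported symptom for Alt + numpad '+' hex entry."""
    # The "+" and any non-decimal digits (a-f) end up in the input buffer.
    leaked = "+" + "".join(d for d in hex_digits if not d.isdecimal())
    # The composed character itself still arrives afterwards.
    return leaked + chr(int(hex_digits, 16))


print(observed_terminal_input("123a"))
```

Under this model, a code made up only of decimal digits would leak just the "+".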

eryksun • Dec 15 '21 07:12

Hey @pfmoore. Thanks for filing this. It's been a while since it's been filed. Is this still occurring in the latest WT version?

carlos-zamora • Dec 07 '22 22:12

I'm struggling to reproduce this now, as the new version of Terminal seems to lose focus when I try to use UnicodeInput with it.

Actually, running the AutoHotkey script

#B::
Send {U+fb00}  ; send U+FB00 (ﬀ) as a Unicode character
return

and then pressing Win-B in a Terminal prompt demonstrates the issue, and yes, it is still present.

pfmoore • Dec 07 '22 23:12

You can follow steps from https://github.com/microsoft/terminal/issues/9879 to reproduce it yourself any time. You don't need any extra tools - just copy-paste from a web page.

snaar • Dec 08 '22 00:12

@snaar Thanks! Remote chance this could be a different issue, though, only owing to this snippet in the original text:

> Using cut and paste to enter the ﬀ character works fine; it's only when a character is entered "directly" that there is an issue.

DHowett • Dec 08 '22 00:12

Well, #9879 was closed as a duplicate of this one, so maybe it needs to be reopened then?

snaar • Dec 08 '22 01:12

Oh, perhaps I misread. I thought #9879 was closed as a duplicate of #1503. I couldn't find a reference to this bug over there except in the original comment :smile:

DHowett • Dec 08 '22 01:12

Oh my bad, I think it is me who misread/misremembered.

snaar • Dec 08 '22 01:12

I can confirm this is different. It does not happen for me with copy/paste, just with (semi-)direct input, such as the AutoHotkey method I demonstrate above.

pfmoore • Dec 08 '22 09:12