CJK font not showing in powershell because of changing to UTF-8 & Consolas

Open SubaruArai opened this issue 3 years ago • 16 comments

Brief description of your issue

When using Windows 10 in a CJK locale (tested in Japanese, but I'm pretty sure it affects all of CJK), the default encoding isn't UTF-8.[^1]

[^1]: In this case, it's Shift JIS (SJIS).

When winget runs, it temporarily changes the encoding to UTF-8 and the font to Consolas. Since Consolas doesn't have CJK characters, nothing is readable.

As a side note, when winget finishes, the encoding and font revert to their defaults.

Steps to reproduce

  1. Prepare a fresh install of Windows 10 in a CJK language.
  2. Install winget from the Microsoft Store or GitHub.
  3. Open PowerShell and enter any winget command, e.g. winget search terminal

Expected behavior

The encoding can change, but the font should not change, or it should at least fall back to a raster font (ugly but readable). If possible, Consolas plus a fallback font (the default font) would be really nice.

Actual behavior

The encoding and font both change, so CJK characters cannot be rendered. Nothing (including the agreement prompt about the terms of transaction!) can be read.

Environment

Windows Package Manager v1.1.13405
Copyright (c) Microsoft Corporation. All rights reserved.
Windows: Windows.Desktop v10.0.19044.1415
パッケージ: Microsoft.DesktopAppInstaller v1.16.13405.0

SubaruArai avatar Jan 05 '22 06:01 SubaruArai

Just to add some background:

  • Many legacy apps in CJK locales still don't use UTF-8, which I suspect is why UTF-8 still isn't the default on Windows 10 worldwide.
  • While PowerShell doesn't work, cmd.exe does.
  • Some may argue that you should just use another terminal emulator[^1], but Microsoft said that PowerShell is replacing Command Prompt, and I'd argue that the "default" terminal emulator (PowerShell) should be usable in all language settings.

[^1]: Yes, Windows Terminal is great, but most of the time you'll be installing it using winget first.

SubaruArai avatar Jan 05 '22 06:01 SubaruArai

This sounds like it may be a bug with Windows Terminal.

denelon avatar Jan 05 '22 17:01 denelon

@denelon No, this is a bug (or feature) in PowerShell, and Windows Terminal has nothing to do with it. The point is: silently changing the encoding while running a program might be a bad idea, especially when the program displays terms of usage. If it does change the encoding, it needs to ensure that the font in use can display all the characters needed (in this case, maybe by not changing the font?).

Just thinking: if changing the font from the program isn't possible, maybe add a conversion layer from UTF-8 to the terminal's encoding? Though some characters will not be convertible from UTF-8, and that'll be another problem...
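
The conversion-layer idea above can be sketched roughly as follows. This is only an illustration in Python, not how winget works: `to_console_bytes` and `unconvertible_chars` are hypothetical helpers, and `cp932` (Python's name for the Windows Shift JIS code page) is assumed as the terminal's encoding. Characters with no mapping in the target code page are replaced with `?`, which is exactly the lossy case mentioned above.

```python
def to_console_bytes(text: str, console_encoding: str = "cp932") -> bytes:
    """Re-encode text for a legacy console code page (hypothetical helper).

    Characters that the code page cannot represent are replaced with '?'.
    """
    return text.encode(console_encoding, errors="replace")


def unconvertible_chars(text: str, console_encoding: str = "cp932") -> list[str]:
    """Return the characters in text that have no mapping in the code page."""
    missing = []
    for ch in text:
        try:
            ch.encode(console_encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing
```

Japanese text survives the round trip through `cp932`, but a Hangul syllable or an emoji does not, so a layer like this would need a policy for what to show in place of such characters.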

Edit: confirmed that Windows Terminal doesn't have this issue, and deleted the lines mentioning it.

SubaruArai avatar Jan 05 '22 23:01 SubaruArai

So are you saying that Windows Terminal doesn't show the issue or that it does? If it does, you need to file a bug with them. If it doesn't, then why not just use that instead of the default? It's the terminal's job to handle code page changes.

Masamune3210 avatar Jan 06 '22 02:01 Masamune3210

I'm saying that this is NOT related to Windows Terminal. It's an issue with PowerShell.

Let me be clear:

  1. AFAIK, Windows 10 ships with PowerShell as its default terminal emulator.
  2. PowerShell doesn't handle fallback fonts.
  3. Windows 10 still ships without defaulting to UTF-8 in CJK countries.
  4. winget silently changes the encoding to UTF-8 while it runs.
  5. Combine those, and you've got a nice unusable application (winget) with default settings.

You might argue that this is PowerShell's fault for not properly handling code page changes, or that the user should change the fonts manually. But since that's the default terminal on the targeted platform (Windows), it should work out of the box, no matter which locale the system is set to.

SubaruArai avatar Jan 06 '22 04:01 SubaruArai

Here's an SO thread about detecting and changing the fonts used in PowerShell: link. Since the link in that thread is broken, here's the Wayback Machine copy of the cmdlet: link

I agree that working with multiple encodings is a PITA, so I suggest still changing the locale, but preventing PowerShell from changing the font.

SubaruArai avatar Jan 06 '22 05:01 SubaruArai

I checked with Windows Terminal and confirmed that this issue is not present there. I've edited all previous posts that mentioned Windows Terminal.

SubaruArai avatar Jan 06 '22 05:01 SubaruArai

It looks like this isn't a winget bug: https://docs.microsoft.com/en-us/troubleshoot/windows-server/system-management-components/powershell-console-characters-garbled-for-cjk-languages

Launching cmd.exe and launching a PowerShell from there makes the issue go away, so this must be the bug.

jedieaston avatar Jan 06 '22 16:01 jedieaston

@jedieaston Thanks for the info! I've never heard of that, but indeed the shortcut was hardcoded to use Consolas.

So I tried to run it directly (workaround 1 from above)... aaand it didn't solve the issue. Now it uses Lucida Console (no CJK) for the font with UTF-8. It's interesting how Windows just doesn't work™ out of the box! Using workaround 2 mentioned above solved this issue, obviously.

To sum it up:

| Workaround No. | Encoding at session start | Font when changed to UTF-8 |
|---|---|---|
| none | non-Unicode (S-JIS in Japan) | Consolas (non-CJK) |
| 1 | same as above | Lucida Console (non-CJK) |
| 2 | same as above | any font (CJK compatible) |

I don't know how or if the winget team wants to address this issue, but I guess it's more of a policy problem than a technical one at this point, since Microsoft has given up on fixing powershell.lnk in Windows 10.

I'd like to ask the winget team: do you think this is something that should be fixed on the winget side?

SubaruArai avatar Jan 07 '22 00:01 SubaruArai

I wonder if the font could be a setting in the visual settings for winget. That would allow users to specify any font they want to use, and if not specified (or not a valid font) then the terminal default could be used

Trenly avatar Jan 07 '22 01:01 Trenly

@Trenly That would be nice, but I think it should be in another issue. The root cause here is the terminal default itself (half of the problem, to be precise).

The terminal defaults are fine with the default encoding, but since winget changes the encoding to UTF-8, the problem arises. Here's a table to make it clear:

| Way to execute | Default encoding | Default font on default encoding | Encoding while running winget | Default font on UTF-8 |
|---|---|---|---|---|
| powershell.lnk | S-JIS | MSゴシック | UTF-8 | Consolas (non-CJK) |
| powershell.exe | S-JIS | MSゴシック | UTF-8 | Lucida Console (non-CJK) |

Note: values are from Windows 10, Japanese.

But as @jedieaston pointed out, this default font problem is an issue with PowerShell that Microsoft has admitted it won't fix.

SubaruArai avatar Jan 07 '22 03:01 SubaruArai

I don’t think there is a VT sequence (https://docs.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences) to change the console font programmatically (or even see what it is). There may be a way to use the older API (that the docs say “please don’t use”) but I feel like there could be unintended side effects (ctrl+c could leave the terminal with the wrong font, for instance). And I don’t know if Windows Terminal is planning on supporting those APIs forever.

My personal choice would be to put this in the docs along with the explanation that using cmd.exe or Windows Terminal will solve the problem. The other thing we could do is have winget --info try to detect whether the user is using a CJK locale, running powershell.exe, and on a build < 22000. In that case we could print a special message like “Experiencing garbled or missing characters when running winget? Read this:” with a link to that docs page to help people out. We’d just have to make sure that’s printed with characters people can read ;)
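
The three-part check described above could look roughly like this. It is a sketch in Python under stated assumptions, not winget code: `should_show_font_hint` and its parameters are hypothetical names, the caller is assumed to have already obtained the system ANSI code page, the host process name, and the OS build number, and the CJK code page set (932, 936, 949, 950) is my reading of "CJK locale".

```python
# Windows ANSI code pages for CJK locales:
# 932 Japanese, 936 Simplified Chinese, 949 Korean, 950 Traditional Chinese.
CJK_CODE_PAGES = {932, 936, 949, 950}

# 22000 is the first Windows 11 build; the proposal targets builds below it.
FIRST_WINDOWS_11_BUILD = 22000


def should_show_font_hint(ansi_code_page: int, host_process: str, os_build: int) -> bool:
    """Decide whether to print the "garbled characters?" hint (hypothetical).

    host_process may be a bare file name or a full Windows path.
    """
    # Keep only the file name and compare case-insensitively.
    host = host_process.replace("\\", "/").rsplit("/", 1)[-1].lower()
    return (
        ansi_code_page in CJK_CODE_PAGES
        and host == "powershell.exe"
        and os_build < FIRST_WINDOWS_11_BUILD
    )
```

Keeping the decision a pure function of three inputs leaves open how those values are gathered, which is the part that would need platform-specific code.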

Does anyone know if there’s a compatibility reason this isn’t being patched in 10/2019? It seems like a big bug.

jedieaston avatar Jan 07 '22 04:01 jedieaston

I've contacted a few of the other teams internally to understand what options we have. Once I am able to pull all the options together, we will discuss the best way to proceed here.

denelon avatar Jan 07 '22 04:01 denelon