Setting to choose codepage of new terminal session for each profile.

Open qqkookie opened this issue 1 year ago • 4 comments

Description of the new feature/enhancement

Let the user set the initial (startup) codepage (CP) of a new terminal session in each terminal profile. The "Advanced" or "Appearance" settings of the "Default" profile and of each individual profile would have radio buttons or a dropdown menu to choose one of:

  • Do nothing (old behavior: the user default CP, unless the registry setting is modified).
  • UTF-8 (CP 65001)
  • Windows-1252 (CP 1252, an extension of US-ASCII similar to ISO-8859-1, the West European CP)
  • Force the user default CP, regardless of CP-related registry settings.
  • (Optional) A custom CP chosen by the user.
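The options above could be modeled roughly as follows. This is only a sketch of the proposed setting, not anything Terminal implements: the names `CodepagePolicy`, `ProfileCodepage`, and `resolve_codepage` are hypothetical, and the caller is assumed to supply the user default CP and any registry override.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CodepagePolicy(Enum):
    """Hypothetical per-profile choices mirroring the list above."""
    INHERIT = "inherit"            # do nothing (old behavior)
    UTF8 = "utf8"                  # CP 65001
    WINDOWS_1252 = "windows1252"   # CP 1252
    USER_DEFAULT = "userDefault"   # ignore CP-related registry settings
    CUSTOM = "custom"              # user-chosen CP


@dataclass
class ProfileCodepage:
    policy: CodepagePolicy = CodepagePolicy.INHERIT
    custom: Optional[int] = None   # only used with CUSTOM


def resolve_codepage(setting: ProfileCodepage,
                     user_default: int,
                     registry_override: Optional[int] = None) -> int:
    """Return the codepage a new session would start with."""
    if setting.policy is CodepagePolicy.UTF8:
        return 65001
    if setting.policy is CodepagePolicy.WINDOWS_1252:
        return 1252
    if setting.policy is CodepagePolicy.USER_DEFAULT:
        return user_default
    if setting.policy is CodepagePolicy.CUSTOM:
        if setting.custom is None:
            raise ValueError("CUSTOM policy needs a codepage number")
        return setting.custom
    # INHERIT: a registry override, if present, wins over the user default.
    return registry_override if registry_override is not None else user_default
```

For example, on a Korean system where the user default is 949, a profile set to `UTF8` would resolve to 65001 while an untouched profile would keep 949.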

Currently, each terminal session starts with the user default CP. If that CP does not match what the user wants, the user has to change it manually with the chcp.com command, or automatically by adding chcp.com to the HKCU\SOFTWARE\Microsoft\Command Processor\"AutoRun" registry value or by modifying the HKCU\Console\"CodePage" registry entry. All of these measures are cumbersome and inflexible, and the setting is hard to change. Another issue even proposed making UTF-8 the default CP.

Proposed technical implementation details (optional)

Implement the function of the CHCP.com command and execute it in the terminal session startup routine, as chosen in the profile setting.
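As a rough illustration of what "the function of CHCP.com" amounts to: a process attached to a console can change that console's input and output codepages with the Win32 calls `SetConsoleCP` and `SetConsoleOutputCP`. The sketch below does this from Python via `ctypes`; the helper name `apply_codepage` is made up for illustration, and how Terminal itself would wire this into its startup path is not something this thread specifies.

```python
import ctypes
import sys


def apply_codepage(codepage: int) -> bool:
    """Set the attached console's input and output codepage.

    Returns True if both Win32 calls succeed, and False on failure or
    when not running on Windows (where the calls do not exist).
    """
    if not 0 < codepage <= 0xFFFF:
        raise ValueError(f"not a plausible codepage number: {codepage}")
    if sys.platform != "win32":
        return False  # nothing to do outside Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    ok_in = kernel32.SetConsoleCP(ctypes.c_uint(codepage))
    ok_out = kernel32.SetConsoleOutputCP(ctypes.c_uint(codepage))
    return bool(ok_in) and bool(ok_out)


if __name__ == "__main__":
    # Example: switch the current console to UTF-8 (CP 65001).
    print("changed" if apply_codepage(65001) else "not changed")
```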

qqkookie avatar Jul 08 '23 07:07 qqkookie

Out of curiosity, do you have a specific reason for that much fine-grained control? We've got a mind to just default all Terminal sessions to the UTF-8 codepage (#1802), with an opt-out to "use the user default codepage". It's {{current_year}} after all, and apps should be doing utf-8 now. But if you've got a more specific use case in mind, that'd be important to know.

zadjii-msft avatar Jul 10 '23 10:07 zadjii-msft

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

The current default behavior of the Windows console on CJK systems (Cmd.exe/Mintty/MSTerminal) is to set a localized CJK OEM codepage such as CP-949 (Korea), CP-932 (Japan), or CP-936 (Mainland China) at session startup. It is old behavior inherited from the ancient MS-DOS days. Many non-Unicode (MBCS) Win32 programs (such as many old Korean/Japanese games) and console-mode command-line tools are localized to these CJK OEM MBCS codepages. They are not localized through message catalogs (or language resources); the text is hard-coded in the exe file itself! This MBCS tradition has lasted for nearly 30 years, and Japanese games steadfastly used MBCS code until recently. That is why the Application Locale Emulation tool is still used.
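To make the problem concrete, here is a small sketch of what happens when bytes written by such an MBCS program are interpreted under a different codepage. The sample string and the helper `show_as` are illustrative only; the codec names are the ones Python ships for these codepages.

```python
# Map Windows codepage numbers to Python codec names.
CODECS = {
    949: "cp949",    # Korean
    932: "cp932",    # Japanese
    936: "cp936",    # Simplified Chinese
    1252: "cp1252",  # West European
    65001: "utf-8",
}


def show_as(raw: bytes, codepage: int) -> str:
    """Decode raw program output the way a console in `codepage` would read it."""
    return raw.decode(CODECS[codepage], errors="replace")


# Text hard-coded in a Korean MBCS program is stored as CP-949 bytes.
original = "한글"
raw = original.encode("cp949")

print(show_as(raw, 949))    # readable: the codepage matches the program
print(show_as(raw, 65001))  # garbled: the same bytes are not valid UTF-8
```

The same mismatch happens in the other direction: UTF-8 output from WSL or Git tools is garbled in a CP-949 session, which is why both kinds of session are needed side by side.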

Windows has a system setting that makes UTF-8 (CP-65001) the default MBCS codepage for non-Unicode (MBCS) programs. But it is quite unstable and causes many compatibility issues with such MBCS Win32 programs. It is also a machine-wide (not per-user) global setting, and switching it requires manual registry tweaking and a reboot. So setting UTF-8 (CP-65001) as the default MBCS codepage is hardly ever used by CJK users. The CJK codepages CP-949, CP-932, and CP-936 will not go away as the default MBCS codepage for CJK Windows users any time soon.

The current best practice is to execute "CHCP 949" or "CHCP 65001" manually at the start of each new console session, which is quite a nuisance. And it does not work for PowerShell. Auto-executing CHCP at session startup through a registry setting or a shortcut link is not flexible enough. But to work with UTF-8/Linux-based environments such as WSL (Windows Subsystem for Linux), Git, or the MSYS2/MinGW tools, we frequently need a UTF-8 console window.
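For comparison, the closest thing to a per-profile setting today is probably to bake the CHCP call into a profile's command line. The sketch below generates such a profile entry for Windows Terminal's settings.json; `name` and `commandline` are existing profile keys, while the helper `profile_with_codepage` and the profile names are made up for illustration. It only covers cmd.exe sessions, which is part of why a real setting would be more flexible.

```python
import json


def profile_with_codepage(name: str, codepage: int) -> dict:
    """Build a profile whose cmd.exe session runs chcp before the prompt appears."""
    return {
        "name": name,
        # /k runs the command and then keeps the interactive session open.
        "commandline": f"cmd.exe /k chcp {codepage}",
    }


profiles = [
    profile_with_codepage("cmd (UTF-8)", 65001),
    profile_with_codepage("cmd (CP-949)", 949),
]
print(json.dumps(profiles, indent=4))
```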

So setting the UTF-8 codepage for selected console sessions (without changing the global Windows default MBCS setting) would be quite handy for users working in a mixed UTF-8/CP-949 codepage environment.

qqkookie avatar Jul 23 '23 01:07 qqkookie

We need to dedupe some of this

  • #10870
    • I don't think this is a dupe of #15678. This thread seems more about some sort of bug where writing text outside of utf-8 to the console to WSL seems to lose the correct encoding. hmm.
  • #9174
  • #1802
    • This is technically a specific case of the following:
  • #15678
    • This is just --codepage <anything>, not just --codepage 65001
  • #11591
    • This is sorta --codepage USE_THE_REG_VALUE. Conpty ignores everything in HKCU/Console, and that includes the codepage.

zadjii-msft avatar Aug 09 '23 21:08 zadjii-msft