notepad4
Per-scheme default encoding and line ending
Some file types have restrictions on encoding and line endings. The most common cases are:
- Batch files should use the system ANSI encoding and CR+LF line endings.
- Shell scripts should use UTF-8 and LF line endings.
- ANSI art should prefer DOS-437 and CR+LF line endings. We have the "Open ANSI Art (*.nfo, *.diz) files in DOS-437 mode" option to handle this (when enabled, it treats *.nfo and *.diz as ANSI art regardless of the file extension configuration).
- Many others only allow or prefer UTF-8 or another encoding, but allow any line ending.
- Some others only allow LF line endings.
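The cases listed above can be pictured as a lookup table from scheme to preferred encoding and line ending. The following is only an illustrative Python sketch, not Notepad4 code: the scheme keys, the `SchemeDefaults` type, the `SYSTEM_ANSI` placeholder, and the `defaults_for` helper are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemeDefaults:
    # None means "no preference, fall back to the global default".
    encoding: Optional[str]
    line_ending: Optional[str]


# Hypothetical placeholder for "whatever the system ANSI code page is".
SYSTEM_ANSI = "system-ansi"

# Only the three cases named in the list above; keys are made-up scheme names.
SCHEME_DEFAULTS = {
    "batch": SchemeDefaults(SYSTEM_ANSI, "\r\n"),
    "shell": SchemeDefaults("utf-8", "\n"),
    "ansi-art": SchemeDefaults("cp437", "\r\n"),
}


def defaults_for(scheme: str, global_default: SchemeDefaults) -> SchemeDefaults:
    """Return the scheme's preference, filling gaps from the global default."""
    pref = SCHEME_DEFAULTS.get(scheme)
    if pref is None:
        return global_default
    return SchemeDefaults(
        pref.encoding or global_default.encoding,
        pref.line_ending or global_default.line_ending,
    )
```

A scheme with no entry simply inherits the global settings, so the table only needs to list file types that actually have a restriction.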
🙏 Please help list these restrictions, or the preferred encodings and line endings, if you know them.
Implementing this requires a large refactoring, as the scheme (file type) is currently detected after encoding detection. Implementing VSCode-like per-file-extension default encoding and line endings (see https://stackoverflow.com/questions/30082741/change-the-encoding-of-a-file-in-visual-studio-code) would be more complex.
When this is implemented:
- On creating new files, the scheme's preferred encoding and line ending will be used, saving the time needed to change the encoding/line ending from the global default settings (the user may even forget to do this, resulting in a saved file that doesn't work as expected).
- On opening a file, the scheme's preferred encoding will be tried first. This is needed to properly handle 7-bit ASCII files (which can be opened in any encoding), especially when the user later changes the file to contain some non-ASCII characters.
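To illustrate why the 7-bit ASCII case matters, here is a rough Python sketch of "try the scheme's preferred encoding first". It is not how Notepad4 detects encodings; `decode_with_preference` and its fallback list are assumptions made for this example only.

```python
def decode_with_preference(data: bytes, preferred: str,
                           fallbacks=("utf-8", "cp1252")):
    """Try the scheme's preferred encoding first, then the fallbacks.

    Returns the decoded text and the name of the encoding that worked.
    """
    for enc in (preferred, *fallbacks):
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort never fails.
    return data.decode("latin-1"), "latin-1"


# Pure 7-bit ASCII decodes to the same text in all of these encodings, so
# content-based detection cannot tell them apart.
ascii_bytes = b"echo hello\r\n"
same_text = ascii_bytes.decode("utf-8") == ascii_bytes.decode("cp437")

# With a scheme preference, the file is tagged with the encoding that later
# non-ASCII edits should be saved in.
text, encoding = decode_with_preference(ascii_bytes, "cp437")
```

Because the ASCII-only file decodes identically everywhere, the only thing the preference changes is which encoding is remembered for the next save, which is exactly when a wrong guess would start to matter.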
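The Alibaba development manual requires IDEs to use UTF-8 and LF. From this blog post and from searching, I found the following, which may be incomplete: LF preferred: c/cpp/h/hpp/idl/msg/sh
Batch files (bat/cmd) must be ANSI; having ASP default to UTF-8 avoids garbled-text problems.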
bat/cmd files can be UTF16-LE (code page 1200) too
No, running a UTF16-LE bat/cmd file will fail.
Rather, it is INI, VBS/JScript (*.js), and PowerShell (*.ps) files that can be UTF16-LE and can't be UTF-8.
But Windows Script Host files (*.wsf) can be ANSI, UTF16-LE, or UTF-8, which is great. https://docs.microsoft.com/en-us/previous-versions//67w03h17(v=vs.85)?redirectedfrom=MSDN
And there is a way to use UTF-8 without a BOM in a bat file.