Input in ANSI Encoding

Open alekhe opened this issue 9 years ago • 11 comments

Notepad2-mod.4.2.25.985_x64.

When ANSI encoding is active, typed characters are not recognized: the text is shown as question marks. When OEM or UTF-8 encoding is active, everything is OK.

My ANSI encoding is Windows-1251.

This does not happen in Notepad2-mod.4.2.25.980.

alekhe avatar Sep 23 '16 03:09 alekhe

I've noticed that too, but it happens only on Windows 10. No idea why.

XhmikosR avatar Sep 23 '16 06:09 XhmikosR

Press F8, then choose an encoding that shows your text correctly and click OK. Then go to File, Encoding, change to Unicode, and press Ctrl+S.

PinoTucana avatar Sep 25 '16 13:09 PinoTucana

That is another thing. Before Windows 10, this worked for ANSI.

XhmikosR avatar Sep 25 '16 14:09 XhmikosR

I've got a somewhat similar problem, too. With the OS configured for Polish everywhere (keyboard layout, regional settings, language for non-Unicode programs etc.), on Notepad2-mod 4.2.25.985 with ANSI encoding active, typing Polish accented letters (AltGr+L,A,S,C...) results in plain Latin characters being input ("lasc" instead of "łąść"); only one Polish letter works correctly: "ó" (AltGr+O). This happens on Windows 8.1, Windows Server 2012 R2 and Windows 10 1607. The problem did not appear with the previous Notepad2-mod version (4.2.25.980) on any of those OSes.

jberezanski avatar Oct 24 '16 21:10 jberezanski

Then I guess it's https://github.com/XhmikosR/notepad2-mod/commit/7529a6b906f1739188dd31c60af8fd742a7acd20?w=1 and specifically https://github.com/XhmikosR/notepad2-mod/commit/7529a6b906f1739188dd31c60af8fd742a7acd20?w=1#diff-3116791c8fff3a31032997bb45378493R1148

Not sure how to proceed. I mean, I find the change right.

XhmikosR avatar Oct 25 '16 07:10 XhmikosR

The change may be right from the Scintilla component's point of view, but not necessarily from the point of view of the entire application.

For a Windows editor, I would expect the term "ANSI encoding" to mean "encoding in the current system default ANSI code page, as returned by the GetACP() function". All editors I've been using worked that way. I would also expect the editor to use this code page by default when opening text files that do not have an unambiguous encoding marker (the UTF-8 or UTF-16 BOM). At present, notepad2-mod .985 exhibits mixed behavior: it reads such files using the system code page (1250 in my case) and the content is displayed correctly, but then I am unable to enter new accented letters, as I described in the previous post.

Perhaps Scintilla can be configured to use a specific default code page (the output of GetACP()) at run time? (I know nothing about its internals or its API.)
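
As an illustration of the behavior described above (use the BOM when there is one, otherwise fall back to the system ANSI code page), here is a minimal Python sketch. It is not code from Notepad2-mod or Scintilla. The names `decode_text` and `system_ansi_codec` are hypothetical, and the helper assumes that on Windows the value returned by `GetACP()` maps to a Python codec named `cp<number>`.

```python
import codecs
import locale
import sys
from typing import Tuple

# Only the markers mentioned in the discussion: UTF-8 and UTF-16 BOMs.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),   # "utf-8-sig" strips the BOM while decoding
    (codecs.BOM_UTF16_LE, "utf-16"),  # "utf-16" reads the BOM to pick byte order
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def system_ansi_codec() -> str:
    """Hypothetical helper: name of the codec for the system ANSI code page."""
    if sys.platform == "win32":
        import ctypes
        # GetACP() returns the code page number, e.g. 1250 or 1251.
        return "cp%d" % ctypes.windll.kernel32.GetACP()
    # Outside Windows there is no ANSI code page; use the locale's encoding.
    return locale.getpreferredencoding(False)


def decode_text(data: bytes, ansi_codec: str) -> Tuple[str, str]:
    """Decode by BOM if present, otherwise with the given ANSI codec."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            return data.decode(codec), codec
    return data.decode(ansi_codec), ansi_codec


if __name__ == "__main__":
    raw = "łąść".encode("cp1250")      # a BOM-less file saved on a Polish system
    print(decode_text(raw, "cp1250"))  # decoded with the fallback code page
```

In a real editor the second argument would come from something like `system_ansi_codec()` rather than a literal, so that the same code page is used both for reading the file and for interpreting typed input.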

jberezanski avatar Oct 25 '16 20:10 jberezanski

You are welcome to submit a PR in order to have the same behavior as before. Otherwise maybe it's time to revisit setting UTF-8 as default.

XhmikosR avatar Nov 23 '16 22:11 XhmikosR

Thinking about it some more, I no longer believe the change in Scintilla was right at all. Their release notes mention crashes on DBCS systems, but crippling ANSI text editing on all non-US systems is not what I would call a good workaround. I looked at the Scintilla API and did not see a way to set an explicit non-DBCS code page (SCI_SETCODEPAGE is intended only for setting DBCS code pages, and setting it turns off many Scintilla features).

And UTF-8 might be a good default, but it does not replace the ability to edit text files in the native system code page.

I'll work on a PR once I finish rebuilding my dev machine (i.e. in a week or so).

jberezanski avatar Dec 06 '16 08:12 jberezanski

That took a bit longer than a week, but the PR is ready now.

jberezanski avatar Jun 08 '17 20:06 jberezanski

I've already switched to UTF-8 default encoding. And patching Scintilla isn't something I like.

XhmikosR avatar Jun 09 '17 22:06 XhmikosR

I thought you'd be okay with that, as you wrote before. Changing the default only sort of avoids the issue for new files, but it does not bring back the ability to edit files saved in the system default non-Unicode code page. Right now, even the UI is misleading: on a system with the default code page set, for example, to 1250 (Eastern European languages), the encoding selection dialog in Notepad2-mod shows "ANSI (Windows-1250)". Yet, if this option is chosen, Scintilla uses the hardcoded 1252 code page. It is no longer possible to configure Notepad2-mod to automatically use the system default code page.
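
To make the 1250 versus 1252 mismatch concrete, here is a small Python sketch. It is an illustration only, not code from Notepad2-mod or Scintilla, and the function name `encodable` is hypothetical. It checks which of the Polish letters from the earlier report can be represented in each code page.

```python
def encodable(text: str, codec: str) -> dict:
    """Map each character to whether the given codec can represent it."""
    result = {}
    for ch in text:
        try:
            ch.encode(codec)
            result[ch] = True
        except UnicodeEncodeError:
            result[ch] = False
    return result


if __name__ == "__main__":
    letters = "łąśćó"
    print("cp1250:", encodable(letters, "cp1250"))
    print("cp1252:", encodable(letters, "cp1252"))
```

All five letters are encodable in cp1250, while in cp1252 only "ó" is. That matches the symptom reported earlier in this thread, where "ó" was the only Polish letter that could be typed, assuming the typed characters are converted through code page 1252 somewhere on the input path.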

jberezanski avatar Jun 10 '17 17:06 jberezanski