Automatic Unicode detection needs to be improved

Open pffang opened this issue 11 years ago • 9 comments

Many GBK texts are recognized as Unicode without a BOM, even though BOM-less Unicode is unusual. I think only UTF-8 without a BOM needs to be supported; everything else should just be treated as the system locale.

pffang avatar Aug 12 '14 08:08 pffang

You should take this upstream.

XhmikosR avatar Aug 12 '14 08:08 XhmikosR

Which one?

pffang avatar Aug 12 '14 12:08 pffang

What do you mean "which one"? This is an upstream issue, not an issue with this fork.

XhmikosR avatar Aug 12 '14 15:08 XhmikosR

@XhmikosR I think he wants to know whether you mean Scintilla or Notepad2 when you say "upstream."

@pffang This would be a Notepad2 issue, sorta. Notepad and Notepad2 both use the Unicode detection heuristics that are built into Windows.

There are a few other things worth noting about Notepad2's Unicode detection, but they're not entirely relevant to this discussion.
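As a rough illustration of why GBK text can be mistaken for BOM-less UTF-16, here is a minimal Python sketch. It does not reproduce the Windows heuristics; it only checks whether a byte sequence is structurally valid in a given encoding. The helper name `decodes_as` and the sample string are made up for this example.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: four Chinese characters stored as GBK (8 bytes).
sample = "中文测试".encode("gbk")

print(decodes_as(sample, "gbk"))        # True: the real encoding
print(decodes_as(sample, "utf-16-le"))  # True: also valid UTF-16, just the wrong text
print(decodes_as(sample, "utf-8"))      # False: UTF-8 rejects these bytes
```

Almost any even-length byte sequence is valid UTF-16, so a detector has to guess from statistics, whereas UTF-8 has enough internal structure that most non-UTF-8 text fails validation outright. That is why detecting BOM-less UTF-8 is far more reliable than detecting BOM-less UTF-16.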

QWp6t avatar Aug 12 '14 16:08 QWp6t

@XhmikosR Sorry for the incomplete description. @QWp6t Thanks.

pffang avatar Aug 12 '14 16:08 pffang

Let us know if you hear from Florian.

Thanks @QWp6t for the help!

XhmikosR avatar Aug 12 '14 17:08 XhmikosR

The following is Florian's reply:

Thanks for your feedback.

Unicode detection is always just a best guess, and personal preferences are subjective. Here in Europe, I have many more UTF-16 files without a BOM than GBK files, so I'd say GBK is unusual.

You can turn off UTF-16 detection, and just rely on BOMs to identify UTF-16 files: http://www.flos-freeware.ch/development-releases/notepad2-FAQs.html#unicode-detection

--Florian
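For readers wondering what "just rely on BOMs" amounts to, here is a minimal Python sketch of that policy combined with the fallback requested at the top of this issue. It is not Notepad2's code. The function name `detect_by_bom` and the `fallback` parameter are assumptions for illustration, with `"gbk"` standing in for the system locale, and UTF-32 BOMs are ignored for brevity.

```python
import codecs


def detect_by_bom(data: bytes, fallback: str = "gbk") -> str:
    """Pick a codec name using BOMs first, then strict UTF-8, then a fallback.

    `fallback` is a stand-in for the system locale encoding (an assumption).
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the UTF-8 BOM on decode
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # this codec reads the BOM to pick the byte order
    try:
        data.decode("utf-8")  # strict validation, no guessing
        return "utf-8"
    except UnicodeDecodeError:
        return fallback


gbk_bytes = "中文".encode("gbk")
print(detect_by_bom(gbk_bytes))                                          # gbk
print(detect_by_bom("héllo".encode("utf-8")))                            # utf-8
print(detect_by_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))     # utf-16
```

The trade-off is the one Florian describes: with this policy, a UTF-16 file that lacks a BOM is never recognized and ends up decoded with the fallback encoding.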

pffang avatar Aug 13 '14 02:08 pffang

Why does UTF-16 text without a BOM even exist in the world? :anguished:

pffang avatar Aug 13 '14 03:08 pffang

For those who do not want to deal with this, you can just turn it off:

SkipUnicodeDetection = 1

http://github.com/svnpenn/dotfiles/blob/d111bb9/notepad2/notepad2.ini#L12

ghost avatar Dec 03 '16 22:12 ghost