
Char.isUpper only handles ascii a-z, unlike Char.toUpper

Open drathier opened this issue 7 years ago • 4 comments

Char.toUpper handles non-ASCII letters, such as åäö -> ÅÄÖ, but Char.isUpper only recognizes ASCII letters. I expect them to agree on which characters are upper case.

Same with Char.toLower and Char.isLower.

This is in core 5.1.1 http://package.elm-lang.org/packages/elm-lang/core/5.1.1/Char
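
To make the mismatch concrete, here is a minimal sketch in Python rather than Elm. It is an analogy, not Elm's actual implementation: `is_upper_ascii` and `to_upper_unicode` are hypothetical names, and the ASCII-only check is an assumption based on the behavior described above.

```python
def is_upper_ascii(c: str) -> bool:
    """ASCII-only classification: True only for 'A'..'Z' (assumed behavior)."""
    return "A" <= c <= "Z"


def to_upper_unicode(c: str) -> str:
    """Unicode-aware conversion, delegating to Python's str.upper()."""
    return c.upper()


# Convert each letter, then ask the classifier about the result.
for ch in "aåäö":
    upper = to_upper_unicode(ch)
    print(ch, "->", upper, "classified as upper:", is_upper_ascii(upper))
```

With this pairing, the converted non-ASCII letters are not classified as upper case, so the two functions disagree about which characters are upper case.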

drathier avatar Feb 13 '18 14:02 drathier

Thanks for the issue! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

process-bot avatar Feb 13 '18 14:02 process-bot

@drathier BTW, the documentation says that all classification functions work only with ASCII, so this is the expected behavior.

ufocoder avatar Feb 28 '18 09:02 ufocoder

The functions that convert to upper/lower case modify more than just ASCII a-z. I'm fine with either behavior, but the combination is troublesome.

Personally, I'd like to drop these 4 functions because not all scripts have upper/lower case characters, so relying on them being different like in English is a problem. This is a request to fix an API inconsistency, not a bug report.

drathier avatar Feb 28 '18 16:02 drathier

Apparently which lowercase characters correspond to what uppercase characters is locale-dependent, so doing the non-ascii version properly will be hard. https://stackoverflow.com/questions/12537377/in-haskell-how-can-i-uppercase-a-unicode-character-with-respect-to-current-local

Also, ß is a single German lowercase character, but it maps to two uppercase characters.
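
Both points can be checked with a short Python sketch. It is only an illustration: it uses Python's built-in, locale-independent `str.upper()`, not anything from Elm, and the variable names are made up for this example.

```python
# One lowercase character in, two uppercase characters out.
eszett_upper = "ß".upper()
print(repr(eszett_upper), len(eszett_upper))

# The length of a whole word changes as well.
word_upper = "straße".upper()
print(word_upper, len("straße"), "->", len(word_upper))

# str.upper() ignores the locale: "i" always becomes "I",
# although Turkish expects the dotted capital "İ" (U+0130).
plain_i_upper = "i".upper()
print(plain_i_upper, plain_i_upper == "\u0130")
```

A function of type Char -> Char cannot return two characters for ß, and one without a locale argument cannot pick between "I" and "İ".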

Supporting ASCII a-z only (which breaks any non-English app with user input, such as given names) or dropping support entirely seem to be the two options here.

drathier avatar Mar 21 '18 12:03 drathier