Capitalizing the German letter ß gives SS instead of ẞ

Open septsea opened this issue 3 years ago • 5 comments

Describe the bug Capitalizing the German letter ß gives SS instead of ẞ.

To Reproduce Steps to reproduce the behavior:

  1. Type any text that contains ß. (For example, Gauß und Weierstraß.)
  2. Select the text.
  3. Press gU.
  4. Gauß und Weierstraß is converted to GAUSS UND WEIERSTRASS.

Expected behavior Since an uppercase version of ß already exists (ẞ, U+1E9E; see https://glyphsapp.com/learn/localize-your-font-german-capital-sharp-s), I expect the result to be GAUẞ UND WEIERSTRAẞ.

Screenshots Animation

Environment (please complete the following information):

  • Extension (VsCodeVim) version: 1.24.1
  • VSCode version: 1.71.2
  • OS: Windows_NT x64 10.0.25211

septsea avatar Oct 05 '22 10:10 septsea

Hmm... looks like in my locale, at least, 'ß'.toUpperCase() and 'ß'.toLocaleUpperCase() both return 'SS'. This could be a locale issue though (this stuff is definitely not my wheelhouse).

@septsea Can you please confirm whether the g~ (toggle case) command works correctly on your machine with 'ß'? I noticed gU and gu use the locale-agnostic methods while g~ uses the locale-sensitive methods.

J-Fields avatar Oct 05 '22 18:10 J-Fields

The g~ command converts ß to SS. @J-Fields

septsea avatar Oct 05 '22 21:10 septsea

Yeah, my testing suggests this is an issue with JavaScript's toLocaleUpperCase (at least in common implementations), even with the German locale.
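
A minimal sketch of that check, assuming a Node.js or browser console. The helper name compareUppercase is made up for illustration and is not part of VSCodeVim:

```javascript
// Compare the locale-agnostic and the locale-sensitive uppercase methods.
function compareUppercase(text, locale) {
  return {
    plain: text.toUpperCase(),                 // locale-agnostic
    localized: text.toLocaleUpperCase(locale), // locale passed explicitly
  };
}

const eszettDemo = compareUppercase('Gauß und Weierstraß', 'de');
console.log(eszettDemo.plain);     // GAUSS UND WEIERSTRASS
console.log(eszettDemo.localized); // GAUSS UND WEIERSTRASS
```

Both calls expand ß (U+00DF) to the two letters SS, so the result is two characters longer than the input.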

J-Fields avatar Oct 06 '22 02:10 J-Fields

I did some experiments.

  • "Möbelträgerfüße".upper() in Python returns MÖBELTRÄGERFÜSSE.
  • "Möbelträgerfüße".toUpperCase() in Java returns MÖBELTRÄGERFÜSSE.
  • "Möbelträgerfüße".upcase in Ruby returns MÖBELTRÄGERFÜSSE.
  • "Möbelträgerfüße".uppercased() in Swift returns MÖBELTRÄGERFÜSSE.
  • "Möbelträgerfüße".to_uppercase() in Rust returns MÖBELTRÄGERFÜSSE.
  • "Möbelträgerfüße".ToUpper() in C# returns MÖBELTRÄGERFÜßE (instead of MÖBELTRÄGERFÜẞE).
  • strings.ToUpper("Möbelträgerfüße") in Go returns MÖBELTRÄGERFÜßE (instead of MÖBELTRÄGERFÜẞE).

Well, ẞ was nonexistent or nonstandard when toUpperCase (or toLocaleUpperCase, or their counterparts in other programming languages) was created.
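
One possible workaround, sketched in JavaScript under the assumption that the capital ẞ (U+1E9E) is always the desired result: replace ß before calling toUpperCase. The function name upperWithCapitalEszett is hypothetical and is not something VSCodeVim provides.

```javascript
// Hypothetical helper: uppercase text while mapping ß (U+00DF) to the
// capital ẞ (U+1E9E) instead of the default expansion "SS".
function upperWithCapitalEszett(text) {
  // Replace ß first; ẞ is already uppercase, so toUpperCase() leaves it alone.
  return text.replace(/\u00DF/g, '\u1E9E').toUpperCase();
}

console.log(upperWithCapitalEszett('Gauß und Weierstraß')); // GAUẞ UND WEIERSTRAẞ
console.log(upperWithCapitalEszett('Möbelträgerfüße'));     // MÖBELTRÄGERFÜẞE
```

Because ẞ lowercases back to ß, a round trip through toLowerCase() restores the original spelling, which is not the case for SS.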

septsea avatar Oct 06 '22 03:10 septsea

Last year I read somewhere that there is indeed such a problem with THAT German character. To me it was (and is) a problem because I am working on an application where that translation should return ONE character, not two ("SS").

And now in another application with a TabulaRecta things are going amiss thanks to THAT specific character.
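
For cases like this, where uppercasing must return exactly one character per input character, a rough sketch could uppercase code point by code point and keep the result only when it is still a single code point. The function name upperOneToOne is hypothetical, and mapping ß to ẞ (U+1E9E) is an assumption about what the application wants; leaving ß unchanged, as C# and Go do, would preserve the length just as well.

```javascript
// Hypothetical helper: uppercase each code point on its own so the output
// always has as many code points as the input.
function upperOneToOne(text) {
  return Array.from(text, (ch) => {
    if (ch === '\u00DF') return '\u1E9E'; // ß -> ẞ (assumed preference)
    const upper = ch.toUpperCase();
    // Expansions such as U+FB01 (fi ligature) -> "FI" would change the
    // length, so those characters are left as they are.
    return Array.from(upper).length === 1 ? upper : ch;
  }).join('');
}

console.log(upperOneToOne('Straße')); // STRAẞE
```

A substitution table built on top of this stays aligned, since every input position maps to exactly one output position.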

lordofscripts avatar Aug 31 '25 16:08 lordofscripts