KeyboardMapper: Convert Utf-8 char in String to Utf-32 code point

Open ronak69 opened this issue 1 year ago • 2 comments

Ensure that the UTF-8 encoded "mapping character" in the String (from
InputBox) is converted to a UTF-32 code point. Previously, only the first
byte of the UTF-8 sequence was used as the UTF-32 code point.

As a result, every non-ASCII mapping character was written to the
keymap file as the wrong character.

Had a hard time coming up with a concise commit message even for (or maybe because of? :thinking:) such a short diff.

ronak69 avatar Jan 23 '24 18:01 ronak69

> Had a hard time coming up with a concise commit message even for (or maybe because of? 🤔) such a short diff.

Details need more explanation than the big picture, so that's not surprising. Having a short and clear description is nice, but use more words if you need them. There is no limit on commit message length.

I don't know much about UTF and how this is handled in String. But I'm pretty sure that no conversion happens when using code_points. And Strings store valid UTF-8, not UTF-32.

LucasChollet avatar Jan 23 '24 20:01 LucasChollet

> But I'm pretty sure that no conversion happens when using code_points. And Strings store valid UTF-8, not UTF-32.

This implies that I did not explain it well in the commit message :sweat_smile:.

The String received from the InputBox holds UTF-8, so any non-ASCII character in it spans multiple bytes. Previously, the code did not decode those bytes into a UTF-32 code point.

This patch calls code_points(), which returns a Utf8View, and uses its iterator to get the code point as a u32 (UTF-32).

The previous code only worked correctly for ASCII characters, which are encoded as a single byte.

ronak69 avatar Jan 23 '24 21:01 ronak69
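
To make the difference discussed above concrete, here is a minimal standalone sketch in plain C++. It does not use AK's String or Utf8View; both function names are hypothetical and exist only for this illustration. The first function mimics the old behaviour (first byte taken as the code point), the second decodes the whole UTF-8 sequence, which is roughly what iterating a UTF-8 view yields. It assumes mostly well-formed input and does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: what the old behaviour amounts to.
// The first byte of the string is used directly as the code point.
inline char32_t first_byte_as_code_point(std::string_view utf8)
{
    return utf8.empty() ? 0 : static_cast<unsigned char>(utf8[0]);
}

// Hypothetical helper: decode the first UTF-8 sequence into a code point.
// Returns U+FFFD for a bad lead byte, a truncated sequence, or a bad
// continuation byte. Overlong forms and surrogates are not checked.
inline char32_t first_code_point(std::string_view utf8)
{
    if (utf8.empty())
        return 0;

    unsigned char lead = static_cast<unsigned char>(utf8[0]);
    std::size_t length = 0;
    char32_t code_point = 0;

    if (lead < 0x80) {
        return lead; // ASCII: one byte, value is the code point
    } else if ((lead & 0xE0) == 0xC0) {
        length = 2;
        code_point = lead & 0x1F;
    } else if ((lead & 0xF0) == 0xE0) {
        length = 3;
        code_point = lead & 0x0F;
    } else if ((lead & 0xF8) == 0xF0) {
        length = 4;
        code_point = lead & 0x07;
    } else {
        return 0xFFFD; // stray continuation byte or invalid lead byte
    }

    if (utf8.size() < length)
        return 0xFFFD; // sequence cut short

    for (std::size_t i = 1; i < length; ++i) {
        unsigned char byte = static_cast<unsigned char>(utf8[i]);
        if ((byte & 0xC0) != 0x80)
            return 0xFFFD; // not a continuation byte
        code_point = (code_point << 6) | (byte & 0x3F);
    }
    return code_point;
}
```

For "é" (bytes 0xC3 0xA9), the first function gives 0xC3, which is U+00C3 "Ã", while the second gives 0xE9, the intended U+00E9. For an ASCII character such as "a" both return 0x61, which is why the old code only appeared correct for ASCII input.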