Poor handling of UTF-8 characters
Toxic seems to be unable to handle certain UTF-8 characters, specifically those encoded as 4 bytes.
I can enter a 3-byte UTF-8 character such as ‘…’ (U+2026 HORIZONTAL ELLIPSIS):
/status away "zZzZ…"
However, I cannot enter a 4-byte character such as ‘😴’ (U+1F634 SLEEPING FACE). It simply doesn't appear after the first double quotation mark:
/status away "
It is also impossible to paste said character onto toxic's command line.
The terminal in which toxic runs shows both characters as expected.
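For reference, the byte lengths mentioned above can be double-checked with a short Python sketch. This only illustrates the encoding and is not code from toxic; the helper name `utf8_len` is made up for this example.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


# U+2026 HORIZONTAL ELLIPSIS and U+1F634 SLEEPING FACE
for ch in ("\u2026", "\U0001F634"):
    # Code points above U+FFFF (outside the Basic Multilingual Plane)
    # are the ones that need four bytes in UTF-8.
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} bytes, above U+FFFF: {ord(ch) > 0xFFFF}")
```

So the characters that fail here are the ones outside the Basic Multilingual Plane, which would be consistent with a problem somewhere in the input path rather than in the terminal itself.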
This is an issue with ncurses unicode support. It uses an old standard which doesn't support certain characters such as the new emojis.
On Wed 15 Apr 2015 13:51 -0700, JFreegman wrote:
This is an issue with ncurses unicode support. It uses an old standard which doesn't support certain characters such as the new emojis.
This is not entirely true. With other ncurses programs you can still input the characters, but in my experience (bash, vim, etc.) they are only displayed as a nondescript rectangle.
(zsh displays the character as "<0001f634>" with reverse colour)
I'm curious how kseistrup is able to display the characters in the terminal correctly though. Maybe I haven't been able to see them because of a bad font set-up.
I'm using “Droid Sans Mono Regular” as font in the terminal emulator.
On Thu 16 Apr 2015 08:46 -0700, Klaus Alexander Seistrup wrote:
I'm using “Droid Sans Mono Regular” as font in the terminal emulator.
I actually tried a vte-based terminal and was able to see the character correctly in some ncurses-based programs. Some display it and accept it as input while others don't, so I wonder whether the former are doing something extra.
In any case I think this is a feature that we want in toxic.
I agree, toxic ought to be able to handle the full range of UTF-8 chars.
To extend this: if someone sends me a 4-byte UTF-8 character (e.g., "🐼", the panda face), I can see it, but I cannot type it to send to my contacts.
While it is true that ncurses struggles with these characters (mainly because of out-of-date wcwidth()/wcswidth() functions), you can work around it by using LD_PRELOAD to load a library with more modern versions. However, even with that workaround (which is why I can see the ones my contacts send me), I still cannot type them with XCompose.
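To give a rough idea of what an up-to-date width function is expected to report, here is a small Python sketch based on the East Asian Width property from the standard `unicodedata` module. It is not what ncurses or libc actually runs: `expected_columns` is a hypothetical helper, and counting ambiguous-width characters as one column is an assumption.

```python
import unicodedata


def expected_columns(ch: str) -> int:
    """Approximate terminal columns for one character (illustration only)."""
    if unicodedata.combining(ch):
        return 0  # combining marks attach to the previous character
    # Wide (W) and Fullwidth (F) characters occupy two columns;
    # everything else, including ambiguous-width, is assumed to take one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


# Ellipsis, a CJK ideograph, sleeping face, panda face
for ch in ("\u2026", "\u4e2d", "\U0001F634", "\U0001F43C"):
    print(f"U+{ord(ch):04X} -> {expected_columns(ch)} column(s)")
```

A width table built from older Unicode data may disagree with this for the emoji, which is the kind of mismatch the LD_PRELOAD workaround is meant to paper over.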
Yes, Chinese text comes out garbled as well.
Is this still an issue? I can use those emojis and Chinese characters just fine on st.
I'm not sure. It's been quite a while since I used toxic. I'll see if I have a moment to test in the next week. Though, I also wouldn't be offended if this is closed as out-of-date.
Buggy unicode support is a well-known issue and will probably never be fixed unless someone else wants to put in the time. I personally don't have time to mess around with it for longer than I have already.