weechat-otr

Python 3 UnicodeDecodeError

Open usrbinsam opened this issue 8 years ago • 4 comments

I get an exception in my core buffer when someone sends Unicode characters to a channel, as in the screenshot below. I'm on Python 3.6.3.

https://puu.sh/yntz6/1e4f54f40d.png

usrbinsam, Nov 16 '17 20:11

Can you try with the latest potr (pure-python-otr, https://github.com/python-otr/pure-python-otr), built manually from master?

We did some Unicode fixes there, IIRC.

koolfy, Nov 16 '17 22:11

After blackbit on IRC just reported the same problem, I did a bit of digging. It seems that Python scripts cannot deal with non-UTF-8 input in WeeChat at all when running under Python 3:

16:01:41 <@FlashCode> when invalid utf-8 is sent to a python 3 callback, there are problems
16:01:54 <@FlashCode> I have no easy solution for this problem
16:02:07 <@FlashCode> for now you should use only signals that return only utf-8

For example, this script triggers a UnicodeDecodeError when it sees non-utf8 input:

python: stdout/stderr (test): UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 67: invalid start byte
python: error in function "message_in_cb"
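The same class of error can be reproduced outside WeeChat with plain Python 3. The following is only a sketch: the sample message, the function names, and the Latin-1 fallback are assumptions made for illustration, not what the WeeChat plugin or weechat-otr actually does.

```python
# Hypothetical stand-in for a raw IRC line carrying Latin-1 text.
# 0xfc is "ü" in ISO-8859-1 and is not a valid UTF-8 start byte.
raw = "PRIVMSG #chan :gr\u00fc\u00dfe".encode("latin-1")


def decode_strict(data):
    """Decode as strict UTF-8; raises UnicodeDecodeError on invalid bytes."""
    return data.decode("utf-8")


def decode_lenient(data, fallback="latin-1"):
    """Try UTF-8 first, then fall back to a single-byte charset (assumed)."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback)


try:
    decode_strict(raw)
    strict_error = None
except UnicodeDecodeError as exc:
    # Same exception type and reason as in the log lines above.
    strict_error = exc

print(strict_error)
print(decode_lenient(raw))
```

A script only sees this error if the bytes are converted to `str` strictly before the callback runs, which is why a fallback inside the script itself cannot help here.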

This has to be fixed in the Python plugin for WeeChat. The upstream bug is https://github.com/weechat/weechat/issues/1389.

tribut, Aug 06 '19 14:08

However, it appears that this does not prevent OTR from working, except that invisible tags do not work for messages with non-UTF-8 content. It just spams the log.

tribut, Aug 06 '19 14:08

The upstream bug is now fixed in master: https://github.com/weechat/weechat/commit/513f5a1ee7ed88ee43b059827dca434edcc51e13

Also, it appears that the irc_in2_* callback is triggered after decoding, but before the message is used. So maybe we should just be using that. If that works, it could mean we can remove a lot of the charset handling code.
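A rough sketch of what hooking the post-decoding modifier might look like, untested against weechat-otr itself. The script name, the callback name `message_in2_cb`, and the pass-through body are hypothetical. The `weechat` module only exists inside WeeChat, so the import is guarded.

```python
try:
    import weechat  # only importable inside WeeChat's Python plugin
except ImportError:
    weechat = None


def message_in2_cb(data, modifier, modifier_data, string):
    """Modifier callback for irc_in2_privmsg.

    `string` is the IRC message after WeeChat's charset decoding;
    for IRC modifiers, `modifier_data` carries the server name.
    """
    # Placeholder: a real script would inspect or rewrite `string` here.
    return string


if weechat is not None:
    # Hypothetical script metadata, for illustration only.
    if weechat.register("in2_test", "example", "0.1", "MIT",
                        "irc_in2 hook sketch", "", ""):
        weechat.hook_modifier("irc_in2_privmsg", "message_in2_cb", "")
```

Whether the message content at this stage still carries everything the OTR code needs is exactly the open question above.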

tribut, Oct 14 '19 14:10