GPGME::Signature.from wrongly assumes UserID.uid is valid UTF-8
I have a key in my keychain whose main UID is not valid UTF-8. This leads to an exception in gpgme when calling GPGME::ctx.verify_result.signatures[0].to_s: to_s relies on @from being a legitimate UTF-8 string. I am losing track of UserID.uid and where exactly it is read via libgpgme. Maybe my key entry is invalid in the first place, but that should not matter, as I would expect libgpgme and ruby-gpgme to read any input and make the best of it. The key is (of course) not mine, so I cannot change it.
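The failure mode can be reproduced in plain Ruby (a sketch; the single byte `\xFF` stands in for whatever invalid sequence the broken UID actually contains):

```ruby
# A string tagged as UTF-8 but containing bytes that are not valid UTF-8,
# similar to what the extension produces for a broken UID.
uid = "Alice \xFF <alice@example.com>".dup.force_encoding(Encoding::UTF_8)

uid.valid_encoding?  # => false

# Any encoding-aware operation on it raises ArgumentError
# ("invalid byte sequence in UTF-8"):
begin
  uid =~ /alice/
rescue ArgumentError => e
  puts e.message
end
```

Plain to_s on such a string returns it unchanged; the exception surfaces as soon as the string is interpreted, e.g. by a regexp match or case conversion.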
Okay, I have figured out which piece of code introduces the wrongly encoded string.
In gpgme_n.c:916, the function utf8_str_new is called on the fields returned by gpgme. These fields should be UTF-8 encoded strings, but sometimes (due to broken keys) they are not. Since we cannot change broken input data, it should not be able to break the program. Currently, a broken input string causes an exception in ruby-gpgme when the string is later accessed from Ruby. I suggest enhancing utf8_str_new to at least replace invalid characters.
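At the Ruby level, the suggested replacement behavior corresponds to String#scrub (available since Ruby 2.1); a C-side utf8_str_new could do the equivalent. A sketch:

```ruby
# A broken UID as the extension currently returns it (hypothetical example byte).
uid = "Alice \xFF <alice@example.com>".dup.force_encoding(Encoding::UTF_8)

# Replace every invalid byte sequence with a placeholder instead of
# propagating the broken string.
clean = uid.scrub("?")
clean                  # => "Alice ? <alice@example.com>"
clean.valid_encoding?  # => true
```

With no argument, scrub substitutes U+FFFD (the Unicode replacement character) instead of "?".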
Thanks, I have pushed a fix for that as 535673d188da06c6cdbd80a69cc67e09387ef2ae, which falls back to ASCII-8BIT if there is any invalid character. Would that make sense, or would it be better to replace invalid characters?
Does it need special care in the upper layers, then? How is ASCII-8BIT handled when used as a string? Broken characters should not lead to display errors, i.e. be reinterpreted as control characters. And they should not provoke exceptions in the Ruby parts, e.g. when using to_s or regexp replacement functions. I will test what happens with the current implementation.