Implement the UTF8ONLY IRCv3 specification

Open progval opened this issue 4 years ago • 17 comments

When the server sends a UTF8ONLY isupport token, override the user configuration and use UTF-8 to send and receive messages.

Relevant bits from https://ircv3.net/specs/extensions/utf8-only:

Servers publishing this token MUST NOT relay content [...] containing non-UTF-8 data to clients.

Clients implementing this specification MUST NOT send non-UTF-8 data to the server once they have seen this token.

If a client implementing this specification sees this token, they MUST set their outgoing encoding to UTF-8 without requiring any user intervention

progval avatar Aug 28 '21 08:08 progval
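As an illustration of the detection step the quoted spec text implies, here is a minimal Python sketch of spotting the UTF8ONLY token in an RPL_ISUPPORT (005) line. It is not taken from KVIrc or from the patch under discussion; the function name `parse_isupport` and the example server and nick are hypothetical, and negated tokens (`-TOKEN`) are deliberately not handled.

```python
from typing import Dict, Optional


def parse_isupport(line: str) -> Dict[str, Optional[str]]:
    """Return the tokens of one RPL_ISUPPORT (005) line as a dict.

    Hypothetical helper for illustration only; "-TOKEN" negation is ignored.
    """
    # Drop an optional ":server.name " source prefix.
    if line.startswith(":"):
        _, _, line = line.partition(" ")
    # Drop the trailing human-readable parameter (" :are supported by ...").
    head = line.split(" :", 1)[0]
    parts = head.split()
    # parts[0] is the numeric, parts[1] is the client's nick.
    if len(parts) < 2 or parts[0] != "005":
        return {}
    tokens: Dict[str, Optional[str]] = {}
    for tok in parts[2:]:
        name, sep, value = tok.partition("=")
        tokens[name] = value if sep else None
    return tokens


example = ":irc.example.org 005 nick UTF8ONLY NETWORK=Example :are supported by this server"
tokens = parse_isupport(example)
saw_utf8only = "UTF8ONLY" in tokens
```

A client would run a check like this on every 005 line received during registration and remember the result for the lifetime of that connection.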

override the user configuration

???

wodim avatar Aug 28 '21 09:08 wodim

One of the purposes of UTF8ONLY is to prevent misconfigured clients from sending or decoding messages with the wrong charset, so I think it makes sense to override it in this case.

progval avatar Aug 28 '21 09:08 progval

I agree if it doesn't overwrite the configuration, but only applies to the current session. For example, a reconnect after a timeout should start with the configured encoding again.

DarthGandalf avatar Aug 28 '21 09:08 DarthGandalf
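The session-only behaviour described above could look roughly like the following Python sketch. It is an assumption about one possible design, not the code in the PR; the class `Connection` and its method names are hypothetical.

```python
from typing import Optional


class Connection:
    """Hypothetical per-connection state with a session-scoped encoding override."""

    def __init__(self, configured_encoding: str) -> None:
        # The saved user setting. Nothing in this class ever rewrites it.
        self.configured_encoding = configured_encoding
        self._session_override: Optional[str] = None

    def on_connect(self) -> None:
        # Every new session (including a reconnect) starts from the user's setting.
        self._session_override = None

    def on_isupport_token(self, name: str) -> None:
        # Seeing UTF8ONLY switches this session to UTF-8.
        if name == "UTF8ONLY":
            self._session_override = "utf-8"

    @property
    def encoding(self) -> str:
        return self._session_override or self.configured_encoding

    def encode_outgoing(self, text: str) -> bytes:
        return text.encode(self.encoding, errors="replace")


conn = Connection("cp1252")
conn.on_connect()
before = conn.encode_outgoing("é")   # encoded with the configured cp1252
conn.on_isupport_token("UTF8ONLY")
after = conn.encode_outgoing("é")    # encoded as UTF-8 for this session only
```

Keeping the override in a separate field means the stored configuration is still what the user chose, and a later connection to a server that does not advertise UTF8ONLY uses it unchanged.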

Considering the server will send that "utf8only" command every single session, this permanently denies the user the choice of using any other encoding.

A general option that says something like "let servers choose an encoding for me", on by default, would be a good compromise.

wodim avatar Aug 28 '21 10:08 wodim

this permanently denies the user the choice of using any other encoding.

Yes, that's a feature. It means that, once enough clients support it, server operators can safely enable UTF8ONLY, as it will switch all clients, even misconfigured ones.

Letting user configuration override UTF8ONLY nullifies the goals of UTF8ONLY.

progval avatar Aug 28 '21 10:08 progval

I don't understand this thing where you think it's appropriate to implement a backdoor in a client to allow a server to change (or ignore) the configuration without informing the user or asking for consent.

As far as I remember we already default to utf8 on all platforms. Changing the encoding for a particular server is a conscious decision by the user. What will happen when users find that they can no longer understand messages written by other users who have an old version of mirc that uses cp1252 for example?

wodim avatar Aug 28 '21 11:08 wodim

If I'm not mistaken, this PR only changes the encoding for connections to servers that advertise UTF8ONLY, not for other connections.

What will happen when users find that they can no longer understand messages written by other users who have an old version of mirc that uses cp1252 for example?

If a server advertises UTF8ONLY, it will reject these messages. So with or without this patch, they won't be displayed.

progval avatar Aug 28 '21 11:08 progval

So how do you use base128 scripts with this?

SoniEx2 avatar Aug 28 '21 12:08 SoniEx2

I don't know how these scripts work, but if they send non-UTF8 data, then UTF8ONLY servers will reject them; with or without this patch.

progval avatar Aug 28 '21 12:08 progval

They send all bytes > 0x7F. So they can use a full 7 bits per byte. It works great today.

SoniEx2 avatar Aug 28 '21 12:08 SoniEx2
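To make the disagreement concrete, here is a small Python sketch of the check a UTF8ONLY server would effectively apply to message content. How the scripts mentioned above actually build their payloads is not stated in the thread; the "set the high bit of every byte" payload below is only a guess used for illustration, and `is_valid_utf8` is a hypothetical helper.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


utf8_msg = "café".encode("utf-8")      # valid UTF-8
cp1252_msg = "café".encode("cp1252")   # ends in a lone 0xE9 byte
# Assumed payload shape: every byte has its high bit set (all bytes > 0x7F).
high_bit_payload = bytes(0x80 | b for b in b"hi")

results = {
    "utf8": is_valid_utf8(utf8_msg),
    "cp1252": is_valid_utf8(cp1252_msg),
    "high_bit": is_valid_utf8(high_bit_payload),
}
```

Under these assumptions, only the first message passes the check. Arbitrary sequences of high bytes can happen to form valid UTF-8, but nothing guarantees it.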

Did you try on a UTF8ONLY network (e.g. ergo.chat)? They are likely to be rejected.

progval avatar Aug 28 '21 13:08 progval

Yes, why break perfectly good stuff that's more efficient than the alternatives?

SoniEx2 avatar Aug 28 '21 13:08 SoniEx2

If it works, it means your script only sends UTF-8 data, so it won't be affected by this patch.

progval avatar Aug 28 '21 13:08 progval

No, your network breaks it. Something that had worked for years without anyone ever complaining.

SoniEx2 avatar Aug 28 '21 13:08 SoniEx2

This is not my network; and I don't see how that's relevant to this discussion. If you don't want UTF8ONLY, you should bring it up there or not use their network. Either way, this patch won't affect your scripts.

progval avatar Aug 28 '21 13:08 progval

Nobody is using "base128 scripts" other than Soni so this is not really a concern. They keep coming up with ridiculous extensions and trying to force developers to implement them to the point that they are banned from several IRC projects (InspIRCd, IRCv3, ircdocs, etc) for trolling.

On a more productive note, over on ircv3-ideas I have proposed a "UTF-8 recommended" ISUPPORT token to be used as part of a migration path to UTF8ONLY which tells clients that they should reconfigure their connection configuration to send UTF-8 data but should be prepared to receive non-UTF-8 data from other users. This will allow users of non-UTF-8 encodings to be automatically migrated over time. I'm likely to implement this into InspIRCd at some point in the future.

SadieCat avatar Aug 28 '21 14:08 SadieCat

Please don't wave around personal attacks as fact.

Also, wasn't there a CHARSET=UTF-8 ISUPPORT already? Why's nobody using it?

SoniEx2 avatar Aug 28 '21 14:08 SoniEx2

Since this:

  • doesn't overwrite any config
  • doesn't remove any freedom from the user (you can't send non-UTF-8 data to a server using this spec anyway)
  • is already implemented in multiple servers and clients (HexChat, mIRC, ...)

I guess it's good to be merged.

ctrlaltca avatar Aug 04 '23 17:08 ctrlaltca