
Support UTF-8 Email Addresses

Open ScottPeterJohnson opened this issue 3 years ago • 27 comments

(This issue was imported from Gitea) jeremy on May 12, 2020: As far as I can tell, https://tools.ietf.org/html/rfc6530 provides support for Unicode characters in the local part of email addresses. However, it looks like I can't send messages to email addresses with Unicode characters, and can't create routing rules for receiving mail at UTF-8 addresses, either.

Is there any plan to enable this functionality?

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 12, 2020: I think Unicode local addresses should be allowed, but I'll have to see what part of the mailserver library doesn't have it enabled.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by jeremy on May 12, 2020: When I try to send to an account with a Unicode email address, I get: image

~~When I try to make an account routing rule with a Unicode email address, I get:~~ It looks like this part works now, at least on the routing rules table. When I try to actually send a message from Gmail to a Unicode purelymail account, I get:

"""
The response was:

local-part of envelope RCPT address contains utf8 but remote server did not offer SMTPUTF8
Final-Recipient: utf8-addr; 📥@richards.dev
Action: failed
Status: 5.6.7
Remote-MTA: dns; mailserver.purelymail.com. (18.204.123.63, the server for the domain richards.dev.)
Diagnostic-Code: smtp; local-part of envelope RCPT address contains utf8 but remote server did not offer SMTPUTF8
Last-Attempt-Date: Tue, 12 May 2020 18:15:42 -0700 (PDT)
test
Jeremy Richards
"""
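
The bounce above comes down to a capability check on the sending side: if the recipient's local part is not plain ASCII, the sender needs the receiving server to advertise the SMTPUTF8 keyword in its EHLO response. A minimal Python sketch of that decision follows. The function names and the sample EHLO lines are assumptions made for illustration; this is not Gmail's or Purelymail's actual code.

```python
def server_offers_smtputf8(ehlo_lines):
    """Return True if any EHLO response line advertises the SMTPUTF8 keyword."""
    for line in ehlo_lines:
        # Lines look like "250-SIZE 52428800" or "250 SMTPUTF8".
        # Skip the 3-digit reply code and its separator, then take the keyword.
        keyword = line[4:].strip().split(" ")[0]
        if keyword.upper() == "SMTPUTF8":
            return True
    return False


def local_part_needs_smtputf8(address):
    """Return True if the part before the last '@' contains non-ASCII characters."""
    local_part = address.rpartition("@")[0]
    return not local_part.isascii()


def can_deliver(address, ehlo_lines):
    """A non-ASCII local part can only be sent if the server offers SMTPUTF8."""
    if local_part_needs_smtputf8(address):
        return server_offers_smtputf8(ehlo_lines)
    return True


# Hypothetical EHLO response from a server without SMTPUTF8 support.
legacy_ehlo = ["250-mail.example", "250-SIZE 52428800", "250 8BITMIME"]
print(can_deliver("📥@richards.dev", legacy_ehlo))
```

With the hypothetical `legacy_ehlo` response, the check fails for the emoji address, which matches the situation the bounce describes.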

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 13, 2020: Hm, I've checked and it looks like the mailserver library we use doesn't support SMTPUTF8 yet. This might therefore take a while to fix because we'll probably have to add that in (plus IMAP and POP variants) or wait for an update from the library itself. I'd estimate a moderate amount of effort.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by jeremy on May 13, 2020: What mailserver library are you using? Just curious.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 13, 2020: Highly modified Apache James.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by rnkn on May 17, 2020: I have been getting a few sieve failures with error notifications from the server and I think it is related to UTF-8 characters in the sender's name part (not address part), e.g.

An error was encountered while processing this mail with the active sieve script for user "w******@b******n.com". The error encountered was:
Command if (3:1): Test address (3:4): org.apache.james.mime4j.field.address.TokenMgrError: Lexical error at line 1, column 7.  Encountered: "\ufffd" (65533), after : ""

From: Le Cinéma Club <[email protected]>
Subject: Now Showing: Virgil Vernier's SAPPHIRE CRYSTAL
Date: 16 May 2020 at 1:27:41 am AEST
To: William Rankin <w******@b*******.com>
Reply-To: Le Cinéma Club <h*****@lecinemaclub.com>

Is there a way I can rewrite the sieve to avoid the name part at least for the time being?

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by rnkn on May 17, 2020: Another for comparison:

An error was encountered while processing this mail with the active sieve script for user "w******@b******n.com". The error encountered was:
Command if (3:1): Test address (3:4): org.apache.james.mime4j.field.address.TokenMgrError: Lexical error at line 1, column 14.  Encountered: "\ufffd" (65533), after : ""

From: Melbourne Cinémathèque <m********************@westnet.com.au>
Subject: Reminder: nominations for CTEQ Committee close May 20, 11:00pm. AGM May 27, 6:30pm on Zoom. AGM Report attached.
Date: 17 May 2020 at 10:46:45 pm AEST
To: William <w******@b******n.com>
Reply-To: Melbourne Cinémathèque <m********************@westnet.com.au>

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 17, 2020: Ah, your issue actually isn't related to this one, rnkn. There's no problem with having a Unicode display name; the Sieve parser is just upset that these emails aren't actually formatted properly. I.e., when you have Unicode you have to write

From: "Melbourne Cinémathèque" <m********************@westnet.com.au>

instead of

From: Melbourne Cinémathèque <m********************@westnet.com.au>

Otherwise it's technically invalid. But since the job of a mailserver is to begrudgingly accept minor spec violations, I've swapped the address parser used in Sieve for the lenient one, and that should be available in production in about thirty minutes. I'll let you know if it appears to work then.
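
For reference, a sender that wants its headers to stay pure ASCII can encode the display name as an RFC 2047 encoded-word rather than putting raw UTF-8 in the header. The sketch below uses Python's standard `email` package to do that and to decode it again. The address `news@example.com` is a placeholder (the real one is masked above), the helper names are made up for this example, and none of this is the parser Purelymail or Apache James uses.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr


def encode_from_header(display_name, address):
    """Build a From value; non-ASCII names become RFC 2047 encoded-words."""
    return formataddr((display_name, address))


def decode_display_name(header_value):
    """Split a From value and decode any encoded-word in the display name."""
    name, address = parseaddr(header_value)
    return str(make_header(decode_header(name))), address


encoded = encode_from_header("Melbourne Cinémathèque", "news@example.com")
print(encoded)                       # ASCII-only header value
print(decode_display_name(encoded))  # original name and address recovered
```

A parser that only accepts ASCII in headers can read the encoded form without complaint, which is why many senders emit it.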

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 17, 2020: rnkn's issue should be fixed now. (Hard to test though, since I don't have any malformed email clients to test with.)

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by rnkn on May 17, 2020: Thanks :) I get these newsletters a few times a week so it shouldn't be too long to confirm.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by rnkn on May 25, 2020: I have since received mail with an unquoted UTF-8 name part, so that confirms my unrelated issue is fixed :)

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by jeremy on October 28, 2020: Is this on Apache James' roadmap? Or yours? Curious to know even rough timeline if one is available.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on October 28, 2020: I don't think it's likely to be done by Apache James anytime soon. However, I could take a preliminary look myself probably around mid-November. I can't guarantee when it'd be completed though, as I'm just a single dev with lots of things to do.

FYI, the reason this is harder than it seems is that the extension involved decided to tack on support for everything in the mail message headers being UTF-8, not just an encoding of the address like the one used for domains, so there's no backwards compatibility. You can't send a UTF-8 address to a server that doesn't support UTF-8 addresses, and a client that doesn't support UTF-8 can't read UTF-8 mail. (UTF-8 also opens up the usual phishing vulnerability of lookalikes.)

Hopefully anyone with a UTF-8 email address is well aware of these problems and has an alternate.
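
To illustrate the asymmetry described above: an internationalized domain has an ASCII-compatible encoding (IDNA, the `xn--` form) that legacy servers can carry, while the local part has no such fallback. A rough Python sketch follows. `to_ascii_compatible` is a made-up helper name, `bücher.example` is a sample domain, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as an illustration rather than a production-grade converter.

```python
def to_ascii_compatible(address):
    """Return an all-ASCII form of the address, or None if that is impossible.

    The domain can be rewritten as ASCII labels via IDNA. The local part has
    no equivalent encoding, so a non-ASCII local part means no fallback.
    """
    local, _, domain = address.rpartition("@")
    if not local.isascii():
        return None
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(to_ascii_compatible("info@bücher.example"))  # domain is rewritten
print(to_ascii_compatible("📥@richards.dev"))       # None: local part blocks it
```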

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by jeremy on May 23, 2021: Hey Scott, just wondering if there's been any progress on this or if progress is planned. Thanks!

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on May 23, 2021: It's still on my backlog unfortunately, haven't gotten to it yet.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by wheeles on August 23, 2021: I'm not sure if this is related. I just tried to create a mailbox with a "." in its name via IMAP, using the Apple Mail app. The mailbox was created, but its name only runs as far as the ".", and everything after the "." became the name of a sub-mailbox created inside the first mailbox.

As a result when you try to use the Mail Import/Export Tool, it fails if it encounters a mailbox with a "." in the name.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on August 23, 2021: The "." is a folder separator. I should probably figure out how to switch that to something modern like "/".
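
A small Python sketch of why the "." causes this: the server treats its hierarchy delimiter as a path separator, so a single name containing the delimiter is read as a parent and a child. The function name and the mailbox name `Receipts.2021` are hypothetical examples, not taken from the server's code.

```python
def split_mailbox_path(name, delimiter="."):
    """Split a mailbox name into hierarchy levels on the server's delimiter."""
    return [part for part in name.split(delimiter) if part]


# With "." as the delimiter, one intended name becomes a parent plus a child.
print(split_mailbox_path("Receipts.2021", delimiter="."))

# With "/" as the delimiter, the same name stays a single mailbox.
print(split_mailbox_path("Receipts.2021", delimiter="/"))
```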

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by jeremy on March 6, 2022: Hey Scott, any idea when this will be prioritized?

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Comment by Scott on March 10, 2022: Unclear. It still seems like it could be a lot of work for a somewhat rare feature, but in the two years (wow, time flies) since the issue was originally posted we have done a bit more rewriting of the IMAP/SMTP servers, so it might be more feasible. I'll try to take a look at how much effort it'd really take this week.

ScottPeterJohnson avatar Mar 22 '22 10:03 ScottPeterJohnson

Hey @ScottPeterJohnson , just doing my 6 month ping on this issue. Were you able to look into it at all?

jeremysprofile avatar Sep 05 '22 04:09 jeremysprofile

(Sorry for slow response, I'm a little lazy.) The IMAP part of the server rewrite is closer, since I rewrote a lot of it since March. But it'd still need a heap of changes, since internationalized email is an exciting and complex beast. Honestly kind of seems like Google made it a complicated chore just to up the complexity, but I might get more resources to deal with it in the future.

I know, it seems like it should be simple, it's just allowing UTF into the local part of email addresses. But instead of using existing encoding they decided to make completely UTF native mail, separate from regular mail, and you have to enable extensions on IMAP/POP to even see it, and you have to upconvert (!) old mail to UTF-8 when showing it to clients that opt in, and...

ScottPeterJohnson avatar Oct 18 '22 11:10 ScottPeterJohnson

Hey @ScottPeterJohnson , just doing my 6 month ping on this issue. Do you have an estimate on when you'll have time / resources for something like this?

jeremysprofile avatar Mar 04 '23 21:03 jeremysprofile

Hey @ScottPeterJohnson , just doing my 6 month ping on this issue. Do you have an estimate on when you'll have time / resources for something like this?

jeremysprofile avatar Oct 09 '23 20:10 jeremysprofile

Happy belated issue anniversary! @ScottPeterJohnson , do you have an estimate on when you'll have time / resources for something like this?

jeremysprofile avatar Mar 24 '24 23:03 jeremysprofile

Hey @ScottPeterJohnson , just doing my 6 month ping on this issue. Do you have an estimate on when you'll have time / resources for something like this?

jeremysprofile avatar Sep 09 '24 20:09 jeremysprofile

I would love to see SMTPUTF8 support. I am a Japanese speaker and it would be very useful to be able to use e-mail addresses that contain Kanji characters.

mullbird avatar May 01 '25 12:05 mullbird

Hey @ScottPeterJohnson , just doing my annual ping on this issue. Do you have an estimate on when you'll have time / resources for something like this?

jeremysprofile avatar Sep 27 '25 22:09 jeremysprofile