
EAI support for notqmail

Open mbhangui opened this issue 4 years ago • 7 comments

This adds EAI support to notqmail. Adapted from Arnt Gulbrandsen's Unicode address support here.

  1. Use conf-smtputf8 to compile the EAI support
  2. If conf-smtputf8 has -DSMTPUTF8, use tryidn2 to see if idn2_lookup_u8() from libidn2 can be used. If yes, create hassmtputf8.h and #define SMTPUTF8
  3. Include hassmtputf8.h in qmail-smtpd.c and qmail-remote.c
  4. The utf8read() function has been adapted from utf8received() by Erwin Hoffmann
  5. The following unit test cases have been added:
    • smtpcode() - in qmail-remote.c (test 3 digit codes, valid and junk codes)
    • mailfrom_parms() in qmail-smtpd.c (test SMTPUTF8 in the MAIL FROM parameter)
    • utf8read() in qmail-remote.c (test utf8, non-utf8)
    • get_capability() in qmail-remote.c to test for presence of EHLO capability
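To make the last test target concrete, here is a minimal sketch of checking an EHLO reply for a capability keyword. It is not the PR's get_capability(); the name `ehlo_has_capability` and its signature are made up for illustration, and it assumes the whole multi-line reply is already in one NUL-terminated buffer.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the PR): returns 1 if the multi-line EHLO
 * reply advertises the keyword `cap`, 0 otherwise. The first reply line is
 * skipped because it carries the server's domain and greeting. */
int ehlo_has_capability(const char *reply, const char *cap)
{
    size_t caplen = strlen(cap);
    const char *line = reply;
    int first = 1;

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len > 0 && line[len - 1] == '\r')
            len--;                                  /* drop the CR of CRLF */

        /* Capability lines look like "250-KEYWORD args" or "250 KEYWORD". */
        if (!first && len >= 4 + caplen && memcmp(line, "250", 3) == 0 &&
            (line[3] == '-' || line[3] == ' ')) {
            size_t i;
            for (i = 0; i < caplen; i++)            /* keywords are case-insensitive */
                if (toupper((unsigned char)line[4 + i]) !=
                    toupper((unsigned char)cap[i]))
                    break;
            /* The keyword must end at end of line or before a space. */
            if (i == caplen && (len == 4 + caplen || line[4 + caplen] == ' '))
                return 1;
        }
        first = 0;
        if (!eol)
            break;
        line = eol + 1;
    }
    return 0;
}
```

A unit test along the lines described above would feed it canned replies with and without a `250-SMTPUTF8` line, plus a near-miss keyword, and check the return value.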

mbhangui avatar Dec 07 '20 18:12 mbhangui

Thank you in particular for taking care to write tests for this functionality, especially since we've seen bugs in the currently available implementations.

When and how should this PR land? My thoughts: since SMTPUTF8 is a relatively small elaboration of qmail's existing 8-bit cleanliness, I'd like to see it built and enabled by default in some future notqmail release. Since the feature introduces a dependency on libidn2, I'd like to defer on-by-default until some later release that brings its own breaking changes. That leaves us merging EAI initially as off-by-default (and perhaps also autoconfigured-if-found). This can happen whenever the PR is ready, so it could potentially be part of 1.09 but shouldn't block that release if not.

schmonz avatar Dec 16 '20 20:12 schmonz

On Thu, 17 Dec 2020 at 02:21, Amitai Schleier wrote:

When and how should this PR land? My thoughts: since SMTPUTF8 is a relatively small elaboration of qmail's existing 8-bit cleanliness, I'd like to see it built and enabled by default in some future notqmail release. Since the feature introduces a dependency on libidn2, I'd like to defer on-by-default until some later release that brings its own breaking changes. That leaves us merging EAI initially as off-by-default (and perhaps also autoconfigured-if-found). This can happen whenever the PR is ready, so it could potentially be part of 1.09 but shouldn't block that release if not.

It can be disabled by default by commenting out -DSMTPUTF8 in conf-smtputf8. But the man page will continue to describe the EAI feature.
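As a rough illustration of what the compile-time switch means for the code, the sketch below guards the EAI path with the SMTPUTF8 macro. In the PR the macro comes from the generated hassmtputf8.h; here it is assumed to arrive via the compiler command line instead, and both function names are hypothetical.

```c
/* Assumption: SMTPUTF8 is defined with -DSMTPUTF8 on the command line.
 * In the PR it would come from the generated hassmtputf8.h instead. */

/* Hypothetical helper: reports whether the EAI code was compiled in. */
int smtputf8_compiled_in(void)
{
#ifdef SMTPUTF8
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: the EHLO keyword to advertise, or a null pointer
 * when EAI support was left out at build time. */
const char *smtputf8_ehlo_keyword(void)
{
    return smtputf8_compiled_in() ? "SMTPUTF8" : (const char *)0;
}
```

With -DSMTPUTF8 commented out, the server side would simply never advertise the keyword, which is the off-by-default behaviour discussed above.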

mbhangui avatar Dec 17 '20 15:12 mbhangui

Should UTF-8 support everywhere really be an option? It is backward compatible with ASCII at the byte level, so being able to read UTF-8 as well might not break anyone's setup...

On the other hand, should the libidn2 dependency be optional, given that it pulls in https://www.gnu.org/software/libunistring/ ?

My ideal would be to provide a tiny idn.c that implements Punycode encoding (is that the aim?), but I have yet to provide this.
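For a sense of scale, here is a sketch of what the encoding half of such a tiny idn.c could look like, following the Punycode algorithm from RFC 3492. It is an illustration only: the function name and interface are assumptions, the input is taken as already-decoded code points, there are no overflow checks, and none of the IDNA mapping or validation that idn2_lookup_u8() performs is done. A real ACE label would also need the "xn--" prefix added by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Punycode parameters from RFC 3492, section 5. */
enum { BASE = 36, TMIN = 1, TMAX = 26, SKEW = 38, DAMP = 700,
       INITIAL_BIAS = 72, INITIAL_N = 128 };

/* Map a digit value 0..35 to 'a'..'z', '0'..'9'. */
static char digit(uint32_t d)
{
    return (char)(d < 26 ? 'a' + d : '0' + (d - 26));
}

/* Bias adaptation, RFC 3492 section 6.1. */
static uint32_t adapt(uint32_t delta, uint32_t numpoints, int first)
{
    uint32_t k = 0;

    delta = first ? delta / DAMP : delta / 2;
    delta += delta / numpoints;
    while (delta > ((BASE - TMIN) * TMAX) / 2) {
        delta /= BASE - TMIN;
        k += BASE;
    }
    return k + (BASE - TMIN + 1) * delta / (delta + SKEW);
}

/* Hypothetical interface: encodes `inlen` code points into `out`.
 * Returns the output length, or -1 if `out` is too small.
 * Sketch only: arithmetic overflow is not checked. */
int punycode_encode(const uint32_t *in, size_t inlen, char *out, size_t outsize)
{
    uint32_t n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
    size_t h, b, i, o = 0;

    if (outsize == 0)
        return -1;
    for (i = 0; i < inlen; i++)             /* copy basic (ASCII) code points */
        if (in[i] < 0x80) {
            if (o + 1 >= outsize) return -1;
            out[o++] = (char)in[i];
        }
    h = b = o;
    if (b > 0) {                            /* delimiter after the basic part */
        if (o + 1 >= outsize) return -1;
        out[o++] = '-';
    }
    while (h < inlen) {
        uint32_t m = UINT32_MAX;

        for (i = 0; i < inlen; i++)         /* smallest code point >= n */
            if (in[i] >= n && in[i] < m)
                m = in[i];
        delta += (m - n) * (uint32_t)(h + 1);
        n = m;
        for (i = 0; i < inlen; i++) {
            if (in[i] < n)
                delta++;
            if (in[i] == n) {               /* emit delta as a variable-length integer */
                uint32_t q = delta, k, t;

                for (k = BASE; ; k += BASE) {
                    t = k <= bias ? TMIN : k >= bias + TMAX ? TMAX : k - bias;
                    if (q < t)
                        break;
                    if (o + 1 >= outsize) return -1;
                    out[o++] = digit(t + (q - t) % (BASE - t));
                    q = (q - t) / (BASE - t);
                }
                if (o + 1 >= outsize) return -1;
                out[o++] = digit(q);
                bias = adapt(delta, (uint32_t)(h + 1), h == b);
                delta = 0;
                h++;
            }
        }
        delta++;
        n++;
    }
    out[o] = '\0';
    return (int)o;
}
```

Feeding it the code points of "bücher" gives "bcher-kva", so the domain label would become "xn--bcher-kva". Whether this is enough to drop libidn2 depends on how much of the IDNA processing notqmail wants to own.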

josuah avatar Jan 22 '21 22:01 josuah

Thank you @mbhangui for bringing this starting point; it puts everything that needs doing on the table.

josuah avatar Jan 24 '21 17:01 josuah

Is there really any need to parse the "with UTF8SMTP" along with checking that the other host supports UTF8SMTP?

If we have an email featuring UTF8 names and we need to send it, we are screwed and end up with the same result: the mail is not delivered. Is failing in a different way worth extra code?

The relevant RFC is RFC 6531, Section 3.6. My understanding is that it may be necessary for the SMTP client (qmail-remote), but the RFC doesn't make this mandatory.

The MAIL command parameter SMTPUTF8 asserts that a message is an internationalized message or the message being sent needs the SMTPUTF8 support. There is still a chance that a message being sent via the MAIL command with the SMTPUTF8 parameter is not an internationalized message. An SMTPUTF8-aware SMTP client or server that requires accurate knowledge of whether a message is internationalized needs to parse all message header fields and MIME header fields [RFC2045] in the message body. However, this specification does not require that the SMTPUTF8-aware SMTP client or server inspects the message.
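To make "inspecting the message" concrete, here is a small sketch of the two checks an implementation might run over an address or header field: whether any byte is outside ASCII, and whether the bytes are well-formed UTF-8 in the RFC 3629 sense. Both function names are made up for illustration and are not taken from the patch's utf8read().

```c
#include <stddef.h>

/* Hypothetical helper: 1 if any byte has the high bit set, else 0. */
int has_non_ascii(const unsigned char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (s[i] & 0x80)
            return 1;
    return 0;
}

/* Hypothetical helper: 1 if the buffer is well-formed UTF-8 (RFC 3629).
 * Rejects truncated sequences, overlong forms, surrogates and values
 * above U+10FFFF. */
int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t need, j;

        if (c < 0x80) { i++; continue; }                      /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;                                        /* stray continuation or 0xF8+ */

        if (len - i <= need)
            return 0;                                         /* sequence is truncated */
        for (j = 1; j <= need; j++) {
            if ((s[i + j] & 0xC0) != 0x80)
                return 0;                                     /* not a continuation byte */
            cp = (cp << 6) | (s[i + j] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;                                         /* overlong, too large, or surrogate */
        i += need + 1;
    }
    return 1;
}
```

Since the RFC does not require this inspection, a check like `has_non_ascii()` on the envelope addresses alone may be all that qmail-remote needs.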

And if we have UTF8 in a mail that lacks "with UTF8SMTP" in a Received header, should we drop the mail, even though it may well be correctly encoded?

We shouldn't be dropping the mail. Rather, qmail-remote shouldn't use the SMTPUTF8 extension in the MAIL FROM command.
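A minimal sketch of that behaviour, assuming qmail-remote already knows two things: whether the queued message was flagged as internationalized (for example via the Received header) and whether the remote host advertised SMTPUTF8. The function name and parameters are hypothetical, and real qmail code would use its own substdio and stralloc routines rather than snprintf().

```c
#include <stdio.h>

/* Hypothetical helper: formats the MAIL FROM command into `buf`.
 * The SMTPUTF8 parameter is added only when the message was flagged as
 * internationalized AND the remote host advertised the extension.
 * Returns the command length, or -1 if `buf` is too small. */
int format_mail_from(char *buf, size_t size, const char *sender,
                     int message_flagged, int remote_advertises)
{
    int use_param = message_flagged && remote_advertises;
    int n = snprintf(buf, size, "MAIL FROM:<%s>%s\r\n",
                     sender, use_param ? " SMTPUTF8" : "");

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

With this shape, a message that lacks the marker is still sent, just without the parameter.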

UTF8 email (and content in general) has no problem with software that lacks support for it (except for graphical programs that need to fetch the codepoints), only with software that religiously rejects content with the then-be-damned extra bit.

If anyone has a reference to some RFC for "with UTF8SMTP", I am highly interested. So far it looks like an indication for debugging which host scrambled the email rather than something carrying useful information.

RFC 6531 (latest) and RFC 5336 (old one)

Hopefully I am not misunderstanding everything.

While going through the commands I realized that the original patch was written as per RFC 5336, hence the header with UTF8SMTP. RFC 5336 has been obsoleted by RFC 6531, and the keyword UTF8SMTP should be changed to SMTPUTF8.

4.3.  WITH Protocol Types Sub-Registry of the Mail Transmission Types
      Registry

   IANA has modified or added the following entries in the "WITH
   protocol types" sub-registry under the "Mail Transmission Types"
   registry.

   +--------------+------------------------------+---------------------+
   | WITH         | Description                  | Reference           |
   | protocol     |                              |                     |
   | types        |                              |                     |
   +--------------+------------------------------+---------------------+
   | UTF8SMTP     | ESMTP with SMTPUTF8          | [RFC6531]           |
   | UTF8SMTPA    | ESMTP with SMTPUTF8 and AUTH | [RFC4954] [RFC6531] |
   | UTF8SMTPS    | ESMTP with SMTPUTF8 and      | [RFC3207] [RFC6531] |
   |              | STARTTLS                     |                     |
   | UTF8SMTPSA   | ESMTP with SMTPUTF8 and both | [RFC3207] [RFC4954] |
   |              | STARTTLS and AUTH            | [RFC6531]           |
   | UTF8LMTP     | LMTP with SMTPUTF8           | [RFC6531]           |
   | UTF8LMTPA    | LMTP with SMTPUTF8 and AUTH  | [RFC4954] [RFC6531] |
   | UTF8LMTPS    | LMTP with SMTPUTF8 and       | [RFC3207] [RFC6531] |
   |              | STARTTLS                     |                     |
   | UTF8LMTPSA   | LMTP with SMTPUTF8 and both  | [RFC3207] [RFC4954] |
   |              | STARTTLS and AUTH            | [RFC6531]           |
   +--------------+------------------------------+---------------------+

Good that you asked those questions and I see that there are minor changes to be made.

mbhangui avatar Jan 25 '21 03:01 mbhangui

While going through the commands I realized that the original patch was written as per RFC 5336, hence the header with UTF8SMTP. RFC 5336 has been obsoleted by RFC 6531, and the keyword UTF8SMTP should be changed to SMTPUTF8.

4.3.  WITH Protocol Types Sub-Registry of the Mail Transmission Types
      Registry

   IANA has modified or added the following entries in the "WITH
   protocol types" sub-registry under the "Mail Transmission Types"
   registry.

   +--------------+------------------------------+---------------------+
   | WITH         | Description                  | Reference           |
   | protocol     |                              |                     |
   | types        |                              |                     |
   +--------------+------------------------------+---------------------+
   | UTF8SMTP     | ESMTP with SMTPUTF8          | [RFC6531]           |
   | UTF8SMTPA    | ESMTP with SMTPUTF8 and AUTH | [RFC4954] [RFC6531] |
   | UTF8SMTPS    | ESMTP with SMTPUTF8 and      | [RFC3207] [RFC6531] |
   |              | STARTTLS                     |                     |
   | UTF8SMTPSA   | ESMTP with SMTPUTF8 and both | [RFC3207] [RFC4954] |
   |              | STARTTLS and AUTH            | [RFC6531]           |
   | UTF8LMTP     | LMTP with SMTPUTF8           | [RFC6531]           |
   | UTF8LMTPA    | LMTP with SMTPUTF8 and AUTH  | [RFC4954] [RFC6531] |
   | UTF8LMTPS    | LMTP with SMTPUTF8 and       | [RFC3207] [RFC6531] |
   |              | STARTTLS                     |                     |
   | UTF8LMTPSA   | LMTP with SMTPUTF8 and both  | [RFC3207] [RFC4954] |
   |              | STARTTLS and AUTH            | [RFC6531]           |
   +--------------+------------------------------+---------------------+

I misread the above. The protocol type still remains UTF8SMTP. RFC 5336 just had the description worded differently:

   The "Mail Transmission Types" registry under the Mail Parameters
   registry is requested to be updated to include the following new
   entries:

   +---------------+----------------------------+----------------------+
   | WITH protocol | Description                | Reference            |
   | types         |                            |                      |
   +---------------+----------------------------+----------------------+
   | UTF8SMTP      | UTF8SMTP with Service      | [RFC5336]            |
   |               | Extensions                 |                      |
   | UTF8SMTPA     | UTF8SMTP with SMTP AUTH    | [RFC4954] [RFC5336]  |
   | UTF8SMTPS     | UTF8SMTP with STARTTLS     | [RFC3207] [RFC5336]  |
   | UTF8SMTPSA    | UTF8SMTP with both         | [RFC3207] [RFC4954]  |
   |               | STARTTLS and SMTP AUTH     | [RFC5336]            |
   +---------------+----------------------------+----------------------+
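Read as code, the SMTP rows of the RFC 6531 registry table boil down to a small lookup keyed on whether STARTTLS and AUTH were used. The sketch below is only an illustration with a made-up function name; the non-UTF8 keywords (ESMTP, ESMTPA, ESMTPS, ESMTPSA) are not from this thread but from RFC 3848, and the LMTP rows are left out.

```c
/* Hypothetical helper: picks the "with" protocol type for a Received
 * header. UTF8* names follow the RFC 6531 registry table; the plain
 * ESMTP* names are the RFC 3848 equivalents. */
const char *received_with(int smtputf8, int tls, int auth)
{
    static const char *const plain[] = { "ESMTP", "ESMTPA", "ESMTPS", "ESMTPSA" };
    static const char *const utf8[]  = { "UTF8SMTP", "UTF8SMTPA",
                                         "UTF8SMTPS", "UTF8SMTPSA" };
    int idx = (tls ? 2 : 0) | (auth ? 1 : 0);   /* bit 1: STARTTLS, bit 0: AUTH */

    return smtputf8 ? utf8[idx] : plain[idx];
}
```

So a session that used SMTPUTF8 over STARTTLS without AUTH would be recorded as "with UTF8SMTPS".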

mbhangui avatar Jan 25 '21 04:01 mbhangui

Is there really any need to parse the "with UTF8SMTP" along with checking that the other host supports UTF8SMTP?

I think I understand what you see as a problem. I'll explain below.

If we have an email featuring UTF8 names and we need to send it, we are screwed and end up with the same result: the mail is not delivered. Is failing in a different way worth extra code?

And if we have UTF8 in a mail that lacks "with UTF8SMTP" in a Received header, should we drop the mail, even though it may well be correctly encoded?

I see a problem. Let us say we are running an SMTP service for clients to relay emails to external domains. qmail-smtpd advertises SMTPUTF8, and a client uses EAI with internationalized addresses in the MAIL FROM or the RCPT TO for a domain such as example.com. The mail goes into the queue and qmail-remote finds that example.com doesn't support EAI. We should actually be dropping the mail with an error like "553 server does not support internationalized email addresses". If we had not advertised SMTPUTF8 in qmail-smtpd, the mail would have gone through.
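One way to picture the choice qmail-remote faces in this scenario is a three-way decision. The sketch below is only an illustration of the option described above (permanent failure with a 553-style error); the enum, the function name and the policy itself are assumptions, and the thread has not settled on the right behaviour.

```c
/* Hypothetical outcome of the per-delivery EAI check. */
enum eai_action {
    EAI_SEND_PLAIN,       /* ASCII-only envelope: send as ordinary ESMTP  */
    EAI_SEND_SMTPUTF8,    /* non-ASCII envelope and remote supports EAI   */
    EAI_FAIL_PERMANENT    /* non-ASCII envelope, remote lacks SMTPUTF8    */
};

/* Hypothetical policy sketch: `envelope_non_ascii` is nonzero when the
 * sender or a recipient address contains non-ASCII bytes;
 * `remote_smtputf8` is nonzero when the remote EHLO reply listed SMTPUTF8. */
enum eai_action eai_decide(int envelope_non_ascii, int remote_smtputf8)
{
    if (!envelope_non_ascii)
        return EAI_SEND_PLAIN;
    return remote_smtputf8 ? EAI_SEND_SMTPUTF8 : EAI_FAIL_PERMANENT;
}
```

In the example.com case this returns EAI_FAIL_PERMANENT, which is where the 553 bounce would be generated.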

One can argue that the original sender will get a bounce and then correct the message, but I'm not sure what we should do. I will read the RFC again to clear up this issue.

mbhangui avatar Jan 25 '21 05:01 mbhangui