piknik icon indicating copy to clipboard operation
piknik copied to clipboard

Client breaks after SMTP server closes connection

Open cortopy opened this issue 2 years ago • 4 comments

Describe the bug

When SMTP server closes connection because of timeout, lettre AsyncSmtpTransport client doesn't recover and fails to deliver message

To Reproduce

I'm using this config

let mailer = AsyncSmtpTransport::<Tokio1Executor>::relay(&smtp_config.relay)
            .unwrap()
            .tls(tls_config)
            .credentials(creds)
            .port(config.smtp.port)
            .build();

and checking how lettre pool behaves with inbucket. Every time I see in inbucket's logs that a connection is closed because of i/o timeout, lettre client will then stop working and error with incomplete response

Expected behavior If the connection is invalid, lettre should create a new one

Environment (please complete the following information):

  • Lettre version: 0.10.0-rc.4
  • OS: Manjaro Linux

Additional context If I'm constantly sending emails, then there is no issue

cortopy avatar Mar 18 '22 17:03 cortopy

I've been debugging this issue a bit more. After enabling the tracing feature, I can reproduce the same behaviour with SmtpTransport.

These are the logs:

Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::client::connection: Wrote: NOOP<CRLF>
Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::client::connection: << 221 Idle timeout, bye bye<CRLF>
Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::pool::sync_impl: reusing a pooled connection
Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::client::connection: Wrote: MAIL FROM:<[email protected]><CRLF>
Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::client::connection: Wrote: QUIT<CRLF>
Mar 18 20:06:41.520 DEBUG purchase event: lettre::transport::smtp::pool::sync_impl: dropping a broken connection instead of recycling it
lettre::transport::smtp::Error { kind: Response, source: "incomplete response" } // this is the error in the Result

these logs point in the same direction that when the connection is broke, the transport's pool doesn't reconnect again and so email sending fails

cortopy avatar Mar 18 '22 20:03 cortopy

If this is a general issue, this is really bad and makes the current implementation unusable in a real world scenario. Did you investigate this further and can you suggest a fix? I think a broken connection should be easy to detect and then handle by restarting the connection.

svenstaro avatar Mar 23 '22 20:03 svenstaro

I just did a quick search and I found there's a TODO annotation already for both the sync and async implementation: https://github.com/lettre/lettre/blob/9273d24e54b089ddfc75d4b6e6a4faf369addbc2/src/transport/smtp/pool/async_impl.rs#L149-L156

The annotation is consistent with what I've observed. If the connection is not good, then sending email fails. But sending another one immediately after works, because the connection was dropped and a new one is created.

I'm guessing the TODO refers to the fact that a new connection should be established as soon as a stale one is detected?

I'm not familiar with the history of this crate. I can see that it used to use r2d2 for pooling. Has it been dropped? Maybe using r2d2 instead of custom pooling could be more reliable and reduce surface for this type of bug? Just my guess

cortopy avatar Mar 24 '22 15:03 cortopy