mantisbt icon indicating copy to clipboard operation
mantisbt copied to clipboard

Ability to specify Timeout for SMTP connection

Open enscope opened this issue 8 years ago • 13 comments

Solves a problem with slower SMTP servers, when the message is actually sent, but phpmailer reports "SMTP Error: Data not accepted." when timeout occurs before SMTP response, causing send_emails.php script to end send-out with error and keep the e-mail for subsequent delivery - causing almost infinite number of delivered e-mails when send-out script is cron scheduled.

enscope avatar Jun 15 '16 23:06 enscope

Should not the default be zero?

badfiles avatar Jun 16 '16 00:06 badfiles

I just realised, I was referring to older docs for PHPMailer and current default is 300 seconds, which left me even more baffled with that error. Still, I think, the setting is useful, as I was able to get rid of the issue by explicitly setting a Timeout.

enscope avatar Jun 16 '16 00:06 enscope

Am I understanding right? Your problem does not occur if you set timeout to a value like 60 but it does occur if you don't set timeout although default timeout in PHPMailer is 300.

atrol avatar Jun 16 '16 10:06 atrol

That is per-command timeout, so decreasing it just cannot solve the problem. Very strange, may I see your SMTP log?

badfiles avatar Jun 16 '16 10:06 badfiles

DEFAULT TIMEOUT (300 SEC)

send_emails.php

    mx-mbpro:scripts mx$ /usr/bin/env php -q send_emails.php 
    16. Jun 2016, 14:56 CEST DB database_api.php:156 db_connect() array (
      0 => 'SET NAMES UTF8',
      1 => '0.0005',
    )
    16. Jun 2016, 14:56 CEST DB plugin_api.php:959 plugin_register_installed() array (
      0 => 'SELECT basename, priority, protected FROM mantis_plugin_table WHERE enabled=1 ORDER BY priority DESC',
      1 => '0.0020',
    )
    16. Jun 2016, 14:56 CEST DB config_api.php:136 config_get() array (
      0 => 'SELECT config_id, user_id, project_id, type, value, access_reqd FROM mantis_config_table',
      1 => '0.0030',
    )
    Sending emails...
    16. Jun 2016, 14:56 CEST DB email_queue_api.php:197 email_queue_get_ids() array (
      0 => 'SELECT email_id FROM mantis_email_table ORDER BY email_id ASC',
      1 => '0.0004',
    )
    16. Jun 2016, 14:56 CEST DB email_queue_api.php:170 email_queue_get() array (
      0 => 'SELECT * FROM mantis_email_table WHERE email_id=\'26\'',
      1 => '0.0003',
    )
    16. Jun 2016, 14:57 CEST MAIL email_api.php:1206 email_send() ERROR: Message could not be sent - SMTP Error: data not accepted.SMTP server error: DATA END command failed
    16. Jun 2016, 14:57 CEST MAIL email_api.php:1053 email_send_all() Server not responding for 5 seconds, aborting
    Done.

SMTP LOG

    Jun 16 14:56:57 *** courieresmtpd[27022]: started,ip=[::ffff:***]
    Jun 16 14:57:12 *** courierd[26100]: newmsg,id=0000000000062015.000000005762A21A.00006997, auth=***: dns; *** (*** [::ffff:***])
    Jun 16 14:57:12 *** courierd[26100]: started,id=0000000000062015.000000005762A21A.00006997,from=<***>,module=local,host=***!***!5034!5036!/home/***!!,addr=<***>
    Jun 16 14:57:12 *** courierd[26100]: Waiting.  shutdown time=none, wakeup time=none, queuedelivering=1, inprogress=1
    Jun 16 14:57:12 *** courierlocal[27038]: id=0000000000062015.000000005762A21A.00006997,from=<***>,addr=<***>,size=7138,success: Message delivered.
    Jun 16 14:57:12 *** courierd[26100]: completed,id=0000000000062015.000000005762A21A.00006997
    Jun 16 14:57:12 *** courierd[26100]: Waiting.  shutdown time=Thu Jun 16 15:15:20 2016, wakeup time=Thu Jun 16 15:15:20 2016, queuedelivering=0, inprogress=0

TIMEOUT 60 SECONDS

send_emails.php

    mx-mbpro:scripts mx$ /usr/bin/env php -q send_emails.php 
    16. Jun 2016, 14:57 CEST DB database_api.php:156 db_connect() array (
      0 => 'SET NAMES UTF8',
      1 => '0.0003',
    )
    16. Jun 2016, 14:57 CEST DB plugin_api.php:959 plugin_register_installed() array (
      0 => 'SELECT basename, priority, protected FROM mantis_plugin_table WHERE enabled=1 ORDER BY priority DESC',
      1 => '0.0003',
    )
    16. Jun 2016, 14:57 CEST DB config_api.php:136 config_get() array (
      0 => 'SELECT config_id, user_id, project_id, type, value, access_reqd FROM mantis_config_table',
      1 => '0.0003',
    )
    Sending emails...
    16. Jun 2016, 14:57 CEST DB email_queue_api.php:197 email_queue_get_ids() array (
      0 => 'SELECT email_id FROM mantis_email_table ORDER BY email_id ASC',
      1 => '0.0006',
    )
    16. Jun 2016, 14:57 CEST DB email_queue_api.php:170 email_queue_get() array (
      0 => 'SELECT * FROM mantis_email_table WHERE email_id=\'26\'',
      1 => '0.0004',
    )
    16. Jun 2016, 14:57 CEST DB email_queue_api.php:185 email_queue_delete() array (
      0 => 'DELETE FROM mantis_email_table WHERE email_id=\'26\'',
      1 => '0.0021',
    )
    Done.

SMTP LOG

    Jun 16 14:57:56 *** courierd[26100]: newmsg,id=0000000000062015.000000005762A248.000069AD, auth=***: dns; *** (*** [::ffff:***])
    Jun 16 14:57:56 *** courierd[26100]: started,id=0000000000062015.000000005762A248.000069AD,from=<***>,module=local,host=***!***!5034!5036!/home/***!!,addr=<***>
    Jun 16 14:57:56 *** courierd[26100]: Waiting.  shutdown time=none, wakeup time=none, queuedelivering=1, inprogress=1
    Jun 16 14:57:56 *** courierlocal[27063]: id=0000000000062015.000000005762A248.000069AD,from=<***>,addr=<***>,size=7138,success: Message delivered.
    Jun 16 14:57:56 *** courierd[26100]: completed,id=0000000000062015.000000005762A248.000069AD
    Jun 16 14:57:56 *** courierd[26100]: Waiting.  shutdown time=Thu Jun 16 15:15:20 2016, wakeup time=Thu Jun 16 15:15:20 2016, queuedelivering=0, inprogress=0

NOTES

The point is, both e-mail messages were delivered correctly, even SMTP log was void of any errors, yet PHPMailer reported an error to Mantis. The other thing is, Mantis then did not deleted the e-mail from queue, causing infinite e-mail delivery (which meant, that my client received 100+ e-mails about one change multiple times, which was not good).

As for why this has actually solved the problem, I am a bit baffled, went through the PHPMailer implementation and they are using timeout on multiple places, mostly regarding the SMTP connection and one place, where they set the max_execution_time to timeout if needed - which I think does not apply here as well, as the script completes with error and not just terminates in the middle of execution.

Btw, was there any reason NOT to implement e-mail retry counter (and max retries limit) in e-mail delivery subsystem? If not, I think I would be willing to implement it for the next release, as it is feature, that would prevent such situations - not delivered notification is better than 100+ times delivered notification.

enscope avatar Jun 16 '16 13:06 enscope

@enscope Thank you for your contribution. A few generic comments to start with:

  • We need each change (pull request) to be linked to an issue in our tracker http://mantisbt.org/bugs - please create one if it does not exist, and reference it in the commit message with the phrase Fixes #<bug_id>. We also prefer to hold discussions there on principle, unless they are closely tied to code review.
  • Your PR needs to update the documentation (Admin Guide) for the new config option
  • Kindly follow our guidelines for formatting commit messages

Now to the specifics. Thanks for providing the log files. Your problem is here (from 300s send_emails.php):

  1. Jun 2016, 14:57 CEST MAIL email_api.php:1053 email_send_all() Server not responding for 5 seconds, aborting

So you're looking at the wrong timeout.

The problem is not with PHPMailer, but with the time limit we've implemented in Mantis (see email_api.php). So if anything your config should apply to that setting.

These 5 seconds were implemented a long time ago (see 71c10f8d27a766f50e8ae2a5db771a365d3ce479), and using a hardcoded value in there is certainly not a good idea IMO, therefore I would suggest you revise your PR to replace this magic constant 5 (line 1052) by a your config option which should probably be renamed $g_mail_send_timeout, default to 5.

dregad avatar Jun 16 '16 13:06 dregad

Btw, was there any reason NOT to implement e-mail retry counter (and max retries limit) in e-mail delivery subsystem? If not, I think I would be willing to implement it for the next release, as it is feature, that would prevent such situations - not delivered notification is better than 100+ times delivered notification.

The email API was written a very long time ago, and has many shortcomings... We value every contribution, so if you're willing to improve the code in this area, then by all means go ahead.

Please open a new issue in the tracker to follow up on that feature.

dregad avatar Jun 16 '16 13:06 dregad

@dregad Thank you for your comments, I will adhere to guidelines from now on, and I apologise for not doing so in this PR.

As for the fix itself, are you really sure, that is the fix? Because, I saw that part as well, but it seems to me, that to actually get to that branch of code, you must pass the !$t_email_sent condition (which is possibly related to 16. Jun 2016, 14:57 CEST MAIL email_api.php:1206 email_send() ERROR: Message could not be sent - SMTP Error: data not accepted.SMTP server error: DATA END command failed), just after that, your are even considering the Mantis timeout:

    if( !$t_email_sent ) {
        if( $p_delete_on_failure ) {
            email_queue_delete( $t_email_data->email_id );
        }

        # ...

        if( microtime( true ) - $t_start > 5 ) {
            log_event( LOG_EMAIL, 'Server not responding for 5 seconds, aborting' );
            break;
        }
    }

Therefore, the error result comes directly from email_send(..) function, which invokes the PHPMailer SMTP routines to perform the operation, which is affected by $t_mail->Timeout property.

Or am I reading it wrong?

enscope avatar Jun 16 '16 13:06 enscope

Sorry, you're right - I misunderstood.

So the question is, why is PHPMailer throwing an exception, yet is successfully sending the e-mail ?

In that case, we need to figure out what's happening in PHPMailer, did you try enabling debug mode ($t_mail->SMTPDebug = 2;) ?

dregad avatar Jun 16 '16 14:06 dregad

@dregad I've tried to replicate the issue once again, with the same data, same environment, same SMTP and server configuration and with default timeout and it's gone and it processes the test e-mail and deletes it from the queue, which leaves me even more baffled - I was able to replicate the issue 10+ times with 100% success before. It seems to be some edge case I am encountering here, possibly connected to socket processing in PHPMailer's SMTP class.

It seems, the only thing I can do right now with this, is to follow the issue more.

And I am opening the ticket for "email send-out retries counter and delivery failure," as it seems to be the only way to protect the recipients from rogue behaviour of e-mail notifications.

PS: I've tried to experiment a bit and set the Timeout to some small amount of time - say 5 seconds - and it successfully replicated the error; following is the PHPMailer communication log (shortened) for that case (which we, obviously, can not be 100% certain is the same as when the issue manifested itself before):

    2016-06-16 14:53:24 Connection: opening to ***:587, timeout=5, options=array (
                                          )
    2016-06-16 14:53:24 Connection: opened
    2016-06-16 14:53:24 SERVER -> CLIENT: 220 *** ESMTP
    2016-06-16 14:53:24 CLIENT -> SERVER: EHLO ***
    2016-06-16 14:53:24 SERVER -> CLIENT: 250-*** Ok.
                                          250-AUTH LOGIN PLAIN CRAM-SHA1
                                          250-STARTTLS
                                          250-XCOURIEREXTENSIONS
                                          250-XVERP=Courier
                                          250-XEXDATA
                                          250-XSECURITY=NONE,STARTTLS
                                          250-PIPELINING
                                          250-8BITMIME
                                          250-SIZE
                                          250 DSN
    2016-06-16 14:53:24 CLIENT -> SERVER: STARTTLS
    2016-06-16 14:53:24 SERVER -> CLIENT: 220 Ok

    *** SNIP ***

    2016-06-16 14:53:25 CLIENT -> SERVER: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    2016-06-16 14:53:25 CLIENT -> SERVER: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    2016-06-16 14:53:25 CLIENT -> SERVER: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
    2016-06-16 14:53:25 CLIENT -> SERVER:
    2016-06-16 14:53:25 CLIENT -> SERVER: .
    2016-06-16 14:53:30 SERVER -> CLIENT:
    2016-06-16 14:53:30 SMTP ERROR: DATA END command failed:
    2016-06-16 14:53:30 SMTP Error: data not accepted.
    16. Jun 2016, 16:53 CEST MAIL email_api.php:1208 email_send() ERROR: Message could not be sent - SMTP Error: data not accepted.SMTP server error: DATA END command failed
    2016-06-16 14:53:30 CLIENT -> SERVER: QUIT
    2016-06-16 14:53:35 SERVER -> CLIENT:
    2016-06-16 14:53:35 SMTP ERROR: QUIT command failed:
    2016-06-16 14:53:35 Connection: closed
    16. Jun 2016, 16:53 CEST MAIL email_api.php:1053 email_send_all() Server not responding for 5 seconds, aborting
    Done.

enscope avatar Jun 16 '16 15:06 enscope

The issue is that the email client (php mailer) submits the email and then waits from the target server to save it to disk and then reply "success". If the timeout is smaller than time for the server to return success, then client will fail, and then try again. Try again means send the mail again and wait for success, etc. With every retry the email is sent again. It is generally good practice for email servers to de-dupe these retries when delivering the mail to the recipients mailbox. Though it seems that in your case this is not happening. Or it can be that every time we retry we have a different timestamp for the mail and hence the duplicate. The de-dupe is based on from, date, message id. So if we keep these consistent across retries, then the duplicates may go away.

As for retries to fail, if we retry on every new email or every 5 minutes, and we do say 3 retries, then it is very likely to fail during a single outage or error with email submission.

In general, I found that it is better and faster to submit email to the local sendmail and let it worry about retrying and queuing. It makes Mantis faster and uses a much better email client to manage the delivery of such emails.

vboctor avatar Jun 16 '16 15:06 vboctor

@vboctor I do not fully understand your explanation. The problem was, that SMTP was not able to respond with OK soon enough and then PHPMailer considered it a failure. Then, Mantis code reacted to the failure result of PHPMailer and proceeded with checking the internal 5 second timeout, which is in this case too small and will cause to break the send-out cycle only after failure from PHPMailer, but neither solves the actual problem of re-sending already sent message, as Mantis "thinks," the message failed before.

As for the retries and proposed solution for them, I've created ticket for that one. My proposed solution is simpler thought, as I leave it to cron daemon configuration to specify time between retries. Please take a look at that ticket, I am fully willing to implement the change, as I just can not risk the situation, my clients are spammed by Mantis again (they actually proposed reverting to Excel bug tracking after few occurrences of the issue, which is not even sub-optimal).

enscope avatar Jun 16 '16 17:06 enscope