nullmailer
Network issues can cause permanent halt
I've been having lots of intermittent network issues, and recently discovered that nullmailer simply stopped delivering any mail a month ago. Restarting it got a few old messages sent, but it quickly hung again, and it has hung a few more times since then.
It should probably time out and retry eventually (even if very slowly, say after a day), and/or send other emails over new connections while one is taking a long time.
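To make the suggested behavior concrete, here is a minimal Python sketch of "time out, then retry with backoff". It is not nullmailer's code (nullmailer is not written in Python), and the names `deliver_with_retry`, `send_once`, and `open_smtp_socket` are hypothetical, chosen only for illustration. The idea is that every network operation carries a timeout, so a stalled connection raises an error instead of blocking forever, and the caller retries later.

```python
import socket
import time


def open_smtp_socket(host, port=25, timeout=30.0):
    """Open a TCP connection with a timeout.

    The timeout passed to create_connection() stays set on the returned
    socket, so later send/recv calls also raise instead of hanging.
    """
    return socket.create_connection((host, port), timeout=timeout)


def deliver_with_retry(send_once, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send_once() until it succeeds or the attempts run out.

    send_once is a hypothetical callable that performs one delivery and
    raises OSError (socket timeouts are a subclass) on failure.
    """
    for attempt in range(attempts):
        try:
            return send_once()
        except OSError:
            if attempt == attempts - 1:
                raise  # Out of attempts: let the caller see the failure.
            # Exponential backoff between attempts: 1x, 2x, 4x base_delay.
            sleep(base_delay * 2 ** attempt)
```

A real sender would use much longer delays and would move on to the next queued message rather than blocking the whole queue on one failing delivery.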
That's puzzling. Nullmailer should retry once the network is back up. What were the last few lines logged by nullmailer-send before it stopped delivering?
...
Starting delivery, 1 message(s) in queue.
Starting delivery: host: smtp.mail.dashjr.org protocol: smtp file: 1621234935.36483
From: <redacted@redacted> to: <redacted@redacted>
Message-Id: <1621234935.340528.36311.nullmailer@redacted>
smtp: Succeeded: 250 2.0.0 Ok: queued as 8946138A002A
Sent file.
Delivery complete, 0 message(s) remain.
Trigger pulled.
Rescanning queue.
Starting delivery, 1 message(s) in queue.
Starting delivery: host: smtp.mail.dashjr.org protocol: smtp file: 1621236068.51599
From: <root@redacted> to: <root@redacted>
Message-Id: <1621236068.582644.51598.nullmailer@redacted>
(nothing for ~a month; restarted nullmailer here)
Rescanning queue.
Starting delivery, 437 message(s) in queue.
Starting delivery: host: smtp.mail.dashjr.org protocol: smtp file: 1621508416.41242
From: <redacted@redacted> to: <redacted@redacted>
Message-Id: <1621508416.073821.40747.nullmailer@redacted>
smtp: Succeeded: 250 2.0.0 Ok: queued as D5E6338A00A0
Sent file.
...
I had a similar problem I couldn't identify and wrote a patch to let systemd restart nullmailer if it stops responding. See merge request.
We are also suffering from this problem. nullmailer-send just stops delivering messages and the queue fills up, but the process is still running and systemd is happy.
It just does not do its job.
After systemctl restart nullmailer.service it continues to work for a few days or weeks.
Is there anything useful we could collect for you while it is hanging, to help debug this further?
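Since the process stays alive while delivery is stuck, one way to notice the hang from the outside is to look at how old the oldest queued message is. The following is a rough Python sketch of that check, not part of nullmailer. The function names are hypothetical, the queue directory (often `/var/spool/nullmailer/queue` in packaged installs) is an assumption that may differ on your system, and the six-hour threshold is arbitrary.

```python
import os
import time


def oldest_queue_age(queue_dir, now=None):
    """Return the age in seconds of the oldest file in queue_dir, or None if empty."""
    now = time.time() if now is None else now
    with os.scandir(queue_dir) as entries:
        mtimes = [entry.stat().st_mtime for entry in entries if entry.is_file()]
    if not mtimes:
        return None
    return now - min(mtimes)


def queue_is_stalled(queue_dir, max_age_seconds=6 * 3600, now=None):
    """True if some message has been waiting longer than max_age_seconds."""
    age = oldest_queue_age(queue_dir, now=now)
    return age is not None and age > max_age_seconds
```

A cron job or monitoring check could call `queue_is_stalled()` and raise an alert or trigger a restart. Note that a message the remote server keeps rejecting would also trip this check, so it indicates "something is old in the queue" rather than proving a hang.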