postal icon indicating copy to clipboard operation
postal copied to clipboard

All messages queuing up because of io.nil:NilClass

Open phache opened this issue 4 years ago • 3 comments

My server stopped sending out emails for a few days and all messages were queuing up, and I found that in the logs that perhaps there were issues making secure connections? I was seeing these in the logs:

Internal error: NoMethodError: undefined method `io' for nil:NilClass

I had to restart the entire postal server to get messages going. Looks like the errors were around:

      if @smtp_client
        result.secure = @smtp_client.secure_socket?
      end

I'm not sure what could be causing this, could it be the mail servers on receiving end not having compatible or correct SSL connections? If so, why is the entire postal server starting to queue up because of issues with only some receivers?

Example log:

worker_1    | [8] [2021-09-19T18:14:18.382] INFO -- : [S38AEUH2KB] [4::13341 15205] Got queued message with exclusive lock
worker_1    | [8] [2021-09-19T18:14:18.398] INFO -- : [S38AEUH2KB] [4::13341 15205] Parsing message content as it hasn't been parsed before
worker_1    | [8] [2021-09-19T18:14:18.556] INFO -- : [S38AEUH2KB] [4::13341 15205] Internal error: NoMethodError: undefined method `io' for nil:NilClass
worker_1    | [8] [2021-09-19T18:14:18.556] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/config/initializers/smtp_extensions.rb:11:in `secure_socket?'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/lib/postal/smtp_sender.rb:112:in `send_message'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/app/jobs/unqueue_message_job.rb:385:in `block in perform'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/app/jobs/unqueue_message_job.rb:30:in `each'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/app/jobs/unqueue_message_job.rb:30:in `perform'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/lib/postal/worker.rb:55:in `receive_job'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /opt/postal/app/lib/postal/worker.rb:88:in `block in join_queue'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer.rb:56:in `call'
worker_1    | [8] [2021-09-19T18:14:18.557] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/channel.rb:1768:in `block in handle_frameset'
worker_1    | [8] [2021-09-19T18:14:18.558] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer_work_pool.rb:108:in `block (2 levels) in run_loop'
worker_1    | [8] [2021-09-19T18:14:18.558] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer_work_pool.rb:103:in `loop'
worker_1    | [8] [2021-09-19T18:14:18.558] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer_work_pool.rb:103:in `block in run_loop'
worker_1    | [8] [2021-09-19T18:14:18.558] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer_work_pool.rb:102:in `catch'
worker_1    | [8] [2021-09-19T18:14:18.558] INFO -- : [S38AEUH2KB] [4::13341 15205] /usr/local/bundle/gems/bunny-2.14.4/lib/bunny/consumer_work_pool.rb:102:in `run_loop'
worker_1    | [8] [2021-09-19T18:14:18.561] INFO -- : [S38AEUH2KB] [4::13341 15205] Queued message was unlocked

Did anyone run into similar issues?

Thanks

phache avatar Sep 20 '21 12:09 phache

I'd be interested to see if restarting the whole host made it work again.

I might be mistaken but it seems this should be a boolean value so I'd imagine there would be no problem in adding a try catch to default it to false.

The actual check is here it seems https://github.com/postalserver/postal/blob/a735219783127cd0495246ccee89422069da1a1c/config/initializers/smtp_extensions.rb#L10-L12

willpower232 avatar Sep 20 '21 17:09 willpower232

I restarted the postal server by executing postal stop and then postal start but I didn't have to restart the host (the linux machine on which docker was running). Would that change anything?

I have been seeing various softfails with a message concerning dh heys, but I really don't know if that's related or not, it just seems to have to do with SSL as well:

SSL_connect returned=1 errno=0 state=error: dh key too small
Support Ref: WDSVRMVW

phache avatar Sep 20 '21 17:09 phache

that problem has been documented on other issues already I believe, is it working with only the postal restart? Perhaps there was some kind of socket glitch which didn't immediately clear.

willpower232 avatar Sep 20 '21 17:09 willpower232

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Feb 22 '24 00:02 github-actions[bot]