postal icon indicating copy to clipboard operation
postal copied to clipboard

Cannot view queue for one of the SMTP servers

Open vishnus opened this issue 7 years ago • 16 comments

Hi all,

We have two SMTP servers set up, of which one of them suddenly shows "We're sorry, but something went wrong" message when we try to access the message queue. (servers/<server_name>/queue) It works fine for the other. What could be the reason? I could find only one 500 internal server error in the logs. Any other place to look for specific logs?

Emails are going fine, getting queued n all. Just that im unable to access the portal.

vishnus avatar Aug 20 '17 16:08 vishnus

Log trace:

[web.1:8139] [2017-08-20T21:47:50.973] INFO -- :   Rendered servers/queue.html.haml within layouts/application (14.4ms)
[web.1:8139] [2017-08-20T21:47:50.973] INFO -- : Completed 500 Internal Server Error in 26ms (ActiveRecord: 2.3ms)
[web.1:8139] [2017-08-20T21:47:50.974] FATAL -- :   
[web.1:8139] [2017-08-20T21:47:50.974] FATAL -- : ActionView::Template::Error (No message found matching provided query {:id=>5254}):
[web.1:8139] [2017-08-20T21:47:50.974] FATAL -- :     1: %ul.messageList
    2:   - for message in messages
    3:     - if message.is_a?(QueuedMessage)
    4:       - message = message.message
    5:     %li.messageList__message
    6:       = link_to organization_server_message_path(organization, @server, message.id), :class => 'messageList__link' do
    7:         .messageList__details{:class => 'messageList__details--' + message.scope}
[web.1:8139] [2017-08-20T21:47:50.974] FATAL -- :   
[web.1:8139] [2017-08-20T21:47:50.974] FATAL -- : lib/postal/message_db/message.rb:13:in `find_one'
lib/postal/message_db/database.rb:36:in `message'
app/models/concerns/has_message.rb:8:in `message'
app/views/messages/_list.html.haml:4:in `block in _app_views_messages__list_html_haml___2663326704246554535_56562900'
app/views/messages/_list.html.haml:2:in `each'
app/views/messages/_list.html.haml:2:in `_app_views_messages__list_html_haml___2663326704246554535_56562900'
app/views/servers/queue.html.haml:21:in `_app_views_servers_queue_html_haml___3156044015280335675_58437840'

vishnus avatar Aug 20 '17 16:08 vishnus

Has the server that doesn't let you see its queue sent any mail yet?

willpower232 avatar Aug 20 '17 16:08 willpower232

yes.. we sent around 100k mails in last couple of days even though it wasn't letting us see the queue

vishnus avatar Aug 20 '17 17:08 vishnus

facing the same issue on my side.

skyrocknroll avatar Aug 23 '17 10:08 skyrocknroll

Same issue occurring here. Often triggered on the last page of the queue.

LaurensN avatar Oct 03 '17 13:10 LaurensN

As a workaround, you can ofcourse delete the offending QueuedMessage.

So in your case @vishnus :

#Start up postal console with 'postal console'

queued_message = QueuedMessage.find_by(message_id: 5254)
#queued_message.destroy

I'm not sure what's causing the issue atm.

LaurensN avatar Oct 03 '17 13:10 LaurensN

@LaurensN ah this is good tip! I had cleared the whole queue from db and got it fixed though. But will keep this in mind. Thanks

vishnus avatar Oct 03 '17 14:10 vishnus

We have the issue too, and the number of missing ids are too many. Deleting one by one doesn't seem to be possible.

arashm avatar Jul 31 '18 14:07 arashm

We've been suffering for a while with this and I've had a look into it.

It looks like a Postal process locked a queued message and then died before being able to unlock or remove it. Eventually the message gets cycled out of the Postal database and the queued message is unable to load in the view and that causes an exception.

I have added a few lines of code to the controller to clean up the queued messages before they go to the view and potentially trigger the show stopping exception.

You can see them here https://github.com/SynergiTech/postal/blob/fixjammedqueue/app/controllers/servers_controller.rb#L72-L84

Please note you will have to postal restart to apply the changes but it seems to work.

@catphish I'm not sure if its worthy of a pull request as it doesn't fix the underlying issue but let me know what you think

willpower232 avatar Aug 01 '18 09:08 willpower232

Hello,

Sadly im facing this here as well (running on latest commit of postal). We have multiple postal instances and all of them seem broken with this stack here:

postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.223] INFO -- : Completed 500 Internal Server Error in 27ms (ActiveRecord: 3.3ms)
postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.224] FATAL -- :
postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.224] FATAL -- : ActionView::Template::Error (No message found matching provided query {:id=>1303349}):
postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.224] FATAL -- :     1: %ul.messageList
postal      | 11:20:29 web.1              | =>     2:   - for message in messages
postal      | 11:20:29 web.1              | =>     3:     - if message.is_a?(QueuedMessage)
postal      | 11:20:29 web.1              | =>     4:       - message = message.message
postal      | 11:20:29 web.1              | =>     5:     %li.messageList__message
postal      | 11:20:29 web.1              | =>     6:       = link_to organization_server_message_path(organization, @server, message.id), :class => 'messageList__link' do
postal      | 11:20:29 web.1              | =>     7:         .messageList__details{:class => 'messageList__details--' + message.scope}
postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.225] FATAL -- :
postal      | 11:20:29 web.1              | => [web.1:27] [2021-01-19T11:20:29.225] FATAL -- : lib/postal/message_db/message.rb:13:in `find_one'
postal      | 11:20:29 web.1              | => lib/postal/message_db/database.rb:36:in `message'
postal      | 11:20:29 web.1              | => app/models/concerns/has_message.rb:8:in `message'
postal      | 11:20:29 web.1              | => app/views/messages/_list.html.haml:4:in `block in _app_views_messages__list_html_haml___1896528471837123852_69826809507580'
postal      | 11:20:29 web.1              | => app/views/messages/_list.html.haml:2:in `each'
postal      | 11:20:29 web.1              | => app/views/messages/_list.html.haml:2:in `_app_views_messages__list_html_haml___1896528471837123852_69826809507580'
postal      | 11:20:29 web.1              | => app/views/servers/queue.html.haml:21:in `_app_views_servers_queue_html_haml___2927834454464425078_69826807176800'
postal      | 11:20:29 web.1              | => 10.9.0.1 - - [19/Jan/2021:11:20:29 +0000] "GET /500 HTTP/1.1" 500 - 0.0316

Is there any possibility to fix this issue?

saschaarthur avatar Jan 19 '21 11:01 saschaarthur

As a workaround, you can ofcourse delete the offending QueuedMessage.

So in your case @vishnus :

#Start up postal console with 'postal console'

queued_message = QueuedMessage.find_by(message_id: 5254)
#queued_message.destroy

I'm not sure what's causing the issue atm.

we had a lot of messages (stuck and locked in the queue) i used this query here to solve it

QueuedMessage.where("locked_at < '2021-01-01'").first().destroy

saschaarthur avatar Jan 19 '21 11:01 saschaarthur

FWIW we're still using what I added years ago https://github.com/postalhq/postal/issues/328#issuecomment-409518956

This automatically cleans up the problem without having to look up message IDs or open the postal console.

willpower232 avatar Jan 19 '21 11:01 willpower232

Thanks @willpower232 but sadly this didnt seem to work (in 4 instances of postal).

Only cleaning the queued messaged 'by hand' let me open the "queue" view again.

Heres an example which i still have on my console:

<QueuedMessage id: 3630548, server_id: 1, message_id: 3630545, domain: "xxx.eu", locked_by: "host:postal-001 pid:39 procname:worker.5 job:VGWNT...", locked_at: "2020-12-10 10:42:14", retry_after: nil, created_at: "2020-12-10 10:42:14", updated_at: "2020-12-10 10:42:14", ip_address_id: 1, attempts: 0, route_id: nil, manual: nil, batch_key: "outgoing-xxx.eu">

saschaarthur avatar Jan 19 '21 11:01 saschaarthur

The same issue, happens on 1 queue, other works normally.

majkinetor avatar Apr 05 '23 14:04 majkinetor

Yeah, well, what can I say. Same thing here in 2023.

phtmgt avatar Jun 21 '23 07:06 phtmgt

Same thing, not fixed yet.

sudmed avatar Aug 29 '23 18:08 sudmed