
Campaign is marked finished though email is not sent to all subscribers

Open jackraj97 opened this issue 11 months ago • 13 comments

Version:

  • listmonk: v3.0.0

Description of the bug and steps to reproduce: Most of my campaigns are sent to only a small fraction of the subscribers and are then marked as finished. I'm also unable to resend the campaign.

  1. Can someone please help check why this happens?
  2. Shouldn't listmonk retry the failed subscribers (in case of errors) instead of simply marking the campaign as finished?
  3. Also, why is the campaign send button disabled, when the campaign is not fully sent?

I have attached screenshots of the campaign list page and the performance settings for this issue. I'm using Brevo SMTP on port 587 with the LOGIN auth protocol.

Screenshots: image image

jackraj97 avatar Mar 01 '24 19:03 jackraj97

What do your logs show?

MaximilianKohler avatar Mar 02 '24 05:03 MaximilianKohler

Logs are always empty.

jackraj97 avatar Mar 03 '24 11:03 jackraj97

listmonk retries e-mails N times as configured in SMTP settings. The lower count indicates that there were errors in sending (despite retries). However, the errors should definitely be logged. You should set an error threshold so that the campaigns are paused on errors and check the error log immediately. Changing settings restarts listmonk and wipes the logs.
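
To make the interaction between retries and the error threshold concrete, here is a minimal sketch. It is not listmonk's actual code: the function name `send_campaign`, the `max_retries` and `max_errors` parameters, and the `flaky_send` stand-in are all hypothetical, and the assumption is simply that a subscriber counts as one error once all retries are exhausted.

```python
def send_campaign(subscribers, send, max_retries=2, max_errors=0):
    """Return (sent, failed, status). Hypothetical model, not listmonk's code.

    max_errors=0 means "no threshold": failures are skipped and the
    campaign still ends up as "finished" with a lower sent count.
    """
    sent, failed = [], []
    for sub in subscribers:
        delivered = False
        # One initial attempt plus max_retries retries.
        for _attempt in range(1 + max_retries):
            try:
                send(sub)
                delivered = True
                break
            except ConnectionError:
                continue
        if delivered:
            sent.append(sub)
            continue
        failed.append(sub)
        # With a threshold set, stop early so the logs can be inspected.
        if max_errors > 0 and len(failed) >= max_errors:
            return sent, failed, "paused"
    return sent, failed, "finished"


def flaky_send(sub):
    """Stand-in for an SMTP send that always fails for even IDs."""
    if sub % 2 == 0:
        raise ConnectionError("EOF")


print(send_campaign(range(1, 11), flaky_send, max_errors=0))
print(send_campaign(range(1, 11), flaky_send, max_errors=3))
```

In this model the first call ends as "finished" with only the odd IDs sent, while the second stops as "paused" after the third failed subscriber, which is the situation where checking the error log immediately is useful.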

knadh avatar Mar 14 '24 12:03 knadh

I'm seeing this too, FWIW. My listmonk logs simply show EOF per subscriber, and the SMTP server seems to know nothing about it. I'll try to reproduce it locally when I get a bit of time next week and see what's up.

rjocoleman avatar Mar 14 '24 23:03 rjocoleman

Ah, EOF indicates a broken network connection with the SMTP server.

knadh avatar Mar 15 '24 02:03 knadh

Well, my issue of getting lots of errors (https://github.com/knadh/listmonk/issues/1717#issuecomment-2013166178) has progressed to the campaign completing/failing with no errors. This was sent to 150k subscribers at a 10x10 rate:

2024/03/25 09:00:03 start processing campaign (Campaign name 2024-03-25 150k)
2024/03/25 09:00:05 campaign (Campaign name 2024-03-25 150k) finished

Screenshot 2024-03-25 140950

A verbose log is essential in this situation, as I have no clue who was sent the email, so the only option I have is to send it again to everyone, and people don't like being spammed with the same email.

MaximilianKohler avatar Mar 25 '24 21:03 MaximilianKohler

In case it's helpful, I found this open-source app that works like a verbose log, listing all the emails sent by SES:

SES Dashboard https://sesdashboard.com/ - https://github.com/Nikeev/sesdashboard

It would be greatly preferred to have that built into listmonk though.

Supposedly it can be done in Cloudwatch, but I haven't been able to figure out a way to get it to list the emails; it only lists the number of emails.

MaximilianKohler avatar Mar 26 '24 00:03 MaximilianKohler

Well I figured out a way to kind of get the list of 49 people it was sent to. Please let me know if there's a better way of doing this.

https://github.com/knadh/listmonk/issues/686 gave me the idea of searching the campaign_views table. So I checked the campaign ID of the failed campaign (525) and did:

psql -U listmonk -h localhost -p 5432 listmonk
\dt
SELECT * from campaign_views where campaign_id=525;

It outputs subscriber IDs, so to get the emails you may be able to modify this command: https://github.com/knadh/listmonk/issues/1629#issuecomment-1879822419

Or maybe these commands can be modified to do something similar https://github.com/knadh/listmonk/issues/1562#issuecomment-1897632311

But I'm not sure how exactly for either one.

It would be better to use a command that directly saves it to a file. This might work https://stackoverflow.com/questions/5331320/psql-save-results-of-command-to-a-file.
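
A rough sketch of the join described above, written against an in-memory SQLite database so it runs on its own. The table and column names (`campaign_views.campaign_id`, `campaign_views.subscriber_id`, `subscribers.id`, `subscribers.email`) are assumptions based on the query shown earlier in this comment, so check them against your own schema with \d before relying on them. Note also that `campaign_views` presumably only contains subscribers whose view was tracked, so it may not list everyone the email was sent to.

```python
import csv
import io
import sqlite3

# Assumed schema: campaign_views(campaign_id, subscriber_id),
# subscribers(id, email). Verify against the real database first.
QUERY = """
SELECT DISTINCT s.id, s.email
FROM campaign_views AS v
JOIN subscribers AS s ON s.id = v.subscriber_id
WHERE v.campaign_id = ?
ORDER BY s.id
"""


def viewed_subscribers(conn, campaign_id):
    """Return (subscriber_id, email) rows recorded for one campaign."""
    return conn.execute(QUERY, (campaign_id,)).fetchall()


def to_csv(rows):
    """Render the rows as CSV text that can be written to a file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["subscriber_id", "email"])
    writer.writerows(rows)
    return buf.getvalue()


def demo_db():
    """Tiny stand-in database with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE subscribers (id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE campaign_views (campaign_id INTEGER, subscriber_id INTEGER);
        INSERT INTO subscribers VALUES
            (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
        INSERT INTO campaign_views VALUES (525, 1), (525, 3), (525, 3), (524, 2);
    """)
    return conn


print(to_csv(viewed_subscribers(demo_db(), 525)))
```

Against the real PostgreSQL database the same SELECT can be pasted into psql; a Python driver such as psycopg would use `%s` instead of `?` as the placeholder.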

MaximilianKohler avatar Mar 26 '24 13:03 MaximilianKohler

Hey everyone, I ran into the same issue, fortunately while testing my setup locally.

Version v3.0.0 (f9120d9 2024-02-04T11:20:27Z, linux/amd64)

I am running a test setup with Mailpit (https://mailpit.axllent.org/) as an SMTP server. Performance configuration:

  • Concurrency: 1
  • Message rate: 1
  • Batch size: 1000
  • Maximum error threshold: 25
  • Sliding window limit: 300 Messages/hour

While testing there were no errors shown in the log. I think there is another bug with the sliding window limit, which might be related to this issue.

Steps to reproduce:

  1. Set a sliding window limit
  2. Start sending your campaign
  3. Pause the campaign
  4. BUG: Log shows pipe.go:122: messages exceeded (300) for the window (1h0m0s since 27 Mar 24 15:30 +0000). Sleeping for 59m12s. even though the limit had not been reached
  5. Disable the limit -> listmonk restarts
  6. Unpause the campaign
  7. The campaign is immediately finished, in my case "Sent: 186 / 345 "

screenshot

If the sliding window limit is disabled when the campaign starts, I can pause and unpause the campaign without issues, and the logs show "start processing campaign" and "stop processing campaign". Notice the second line in the logs below: this was the moment I paused the campaign, and no "stop processing campaign" was logged. At 15:32:20 I disabled the sliding window. The UI showed a warning that I should pause my campaigns. The campaign overview showed the campaign as paused, but maybe listmonk did not pause it internally? Notice how at 15:32:46 the campaign jumped to finished in an instant.

Here are the logs:

listmonk_app  | 2024/03/27 15:31:15 manager.go:409: start processing campaign (Copy of Copy of Testkampagne)
listmonk_app  | 2024/03/27 15:31:38 pipe.go:122: messages exceeded (300) for the window (1h0m0s since 27 Mar 24 15:30 +0000). Sleeping for 59m12s.
listmonk_app  | 2024/03/27 15:32:20 init.go:843: reloading on signal ...
listmonk_app  | 2024/03/27 15:32:20 init.go:796: HTTP server shut down
listmonk_app  | 2024/03/27 15:32:21 main.go:102: v3.0.0 (f9120d9 2024-02-04T11:20:27Z, linux/amd64)
listmonk_app  | 2024/03/27 15:32:21 init.go:150: reading config: config.toml
listmonk_app  | 2024/03/27 15:32:21 init.go:289: connecting to db: listmonk_db:5432/listmonk
listmonk_app  | 2024/03/27 15:32:21 init.go:618: media upload provider: filesystem
listmonk_app  | 2024/03/27 15:32:21 init.go:541: loaded email (SMTP) messenger: username@mailpit
listmonk_app  | ⇨ http server started on [::]:9000
listmonk_app  | 2024/03/27 15:32:46 manager.go:409: start processing campaign (Copy of Copy of Testkampagne)
listmonk_app  | 2024/03/27 15:32:46 pipe.go:217: campaign (Copy of Copy of Testkampagne) finished
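
For readers unfamiliar with the "messages exceeded ... Sleeping for ..." line above, this is a minimal sketch of how a sliding window limit of this kind can work. It is an illustration under assumptions, not the code in pipe.go: the class name `SlidingWindow` and its `acquire` method are hypothetical. It shows the expected behaviour (sleep only once the limit is actually reached) and does not reproduce the bug reported here.

```python
from datetime import datetime, timedelta


class SlidingWindow:
    """Allow at most `limit` messages per `duration` (hypothetical model)."""

    def __init__(self, limit, duration):
        self.limit = limit
        self.duration = duration
        self.window_start = None
        self.count = 0

    def acquire(self, now):
        """Try to send one message at `now`.

        Returns timedelta(0) if the message may be sent, otherwise how
        long the sender should sleep until the current window ends.
        """
        # Start a fresh window on first use or once the old one expired.
        if self.window_start is None or now - self.window_start >= self.duration:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return self.window_start + self.duration - now
        self.count += 1
        return timedelta(0)


window = SlidingWindow(limit=3, duration=timedelta(hours=1))
start = datetime(2024, 3, 27, 15, 30, 0)
for offset in (0, 10, 20, 48):
    print(offset, window.acquire(start + timedelta(seconds=offset)))
```

In this model the first three calls are allowed and the fourth, 48 seconds into the window, is told to sleep for the remaining 59 minutes and 12 seconds. A sleep being triggered before `count` reaches `limit` would point at the counter or the window start being stale, which may be worth checking in the real implementation.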

I hope this helps in finding the issue. I am quite satisfied with listmonk in general, but I am worried that my campaigns will stop randomly and that I will need to send them twice.

Let me know if I should open another issue for the sliding window limit warning on pausing campaigns.

Thanks!

stephdin avatar Mar 27 '24 16:03 stephdin

Hey @knadh good news! I think I figured out the problem. In my report a couple comments up https://github.com/knadh/listmonk/issues/1762#issuecomment-2018923332 I was sending out a campaign to 150k people.

  • I discovered that the campaigns are sent out in the reverse order of the list /admin/subscribers/lists/423. So you go to the last page (7500), and those subscribers are emailed first.
  • I clicked through a few of those pages and noticed there were a few pages with only blocklisted emails.
  • I looked in /admin/settings -> performance -> batch size, and saw the number was 500.

I'm pretty sure the issue occurs when all the subscribers in the batch are blocklisted.
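
The hypothesis can be illustrated with a small, purely hypothetical model. It assumes a loop that first takes a page of subscribers by ID, then drops the blocklisted ones, and treats an empty result as "campaign finished". Whether listmonk actually behaves like this is exactly what is in question; later in this thread the maintainer describes the real query as filtering before slicing. The function name and the "status" values are made up for the example.

```python
def run_slice_then_filter(subscribers, batch_size):
    """Hypothetical loop: page by ID first, drop blocklisted rows afterwards."""
    sent, last_id = [], 0
    ordered = sorted(subscribers, key=lambda s: s["id"])
    while True:
        # Take the next page purely by ID, ignoring subscriber status.
        page = [s for s in ordered if s["id"] > last_id][:batch_size]
        # Only afterwards remove blocklisted subscribers from the page.
        batch = [s for s in page if s["status"] != "blocklisted"]
        if not batch:
            # An empty batch is interpreted as "nothing left to send".
            return sent
        sent.extend(s["id"] for s in batch)
        last_id = page[-1]["id"]


# IDs 4 to 6 form one full page of blocklisted subscribers.
DEMO_SUBSCRIBERS = [
    {"id": i, "status": "blocklisted" if 4 <= i <= 6 else "enabled"}
    for i in range(1, 10)
]
print(run_slice_then_filter(DEMO_SUBSCRIBERS, batch_size=3))  # [1, 2, 3]
```

With a batch size of 3, the second page contains only blocklisted subscribers, so this model stops after the first page and never reaches IDs 7 to 9.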

MaximilianKohler avatar Mar 30 '24 11:03 MaximilianKohler

@knadh I finally got a chance to save the logs. Below is the error I get when a campaign runs. At the end only 50% of the subscribers receive the emails.

Can you please let me know how I can fix this issue?

2024/06/06 09:16:55 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2376: timed out waiting for free conn in pool
2024/06/06 09:17:06 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2382: timed out waiting for free conn in pool
2024/06/06 09:17:17 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2383: timed out waiting for free conn in pool
2024/06/06 09:17:27 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2393: timed out waiting for free conn in pool
2024/06/06 09:17:37 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2397: timed out waiting for free conn in pool
2024/06/06 09:17:48 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2401: timed out waiting for free conn in pool
2024/06/06 09:17:58 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2405: timed out waiting for free conn in pool
2024/06/06 09:18:08 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2408: timed out waiting for free conn in pool
2024/06/06 09:18:20 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2412: timed out waiting for free conn in pool
2024/06/06 09:18:30 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2413: timed out waiting for free conn in pool
2024/06/06 09:18:40 manager.go:485: error sending message in campaign WordPress Collection #73 - How to Use MailChimp with WordPress and more: subscriber 2416: timed out waiting for free conn in pool
2024/06/06 09:18:40 pipe.go:217: campaign (WordPress Collection #73 - How to Use MailChimp with WordPress and more) finished

performance configuration: image

jackraj97 avatar Jun 13 '24 18:06 jackraj97

@jackraj97 did you look through the other issues that cover that error? https://github.com/knadh/listmonk/issues?q=is%3Aissue+timed+out+waiting+for+free+conn+in+pool

MaximilianKohler avatar Jun 14 '24 01:06 MaximilianKohler

Hi @knadh, we are also facing the same issue as @MaximilianKohler. The list had 200K subscribers and the campaign was marked as finished after sending to just 37 subscribers. We observed that the list has many blocklisted users and that some email IDs exist in multiple lists. We could not find anything in the logs.

  1. Is there a way to turn on detailed logging?
  2. We observed that the total campaign count displayed in the UI includes the blocklisted users in the list and exactly matches the list size (it should have excluded the blocklisted users in the list).
  3. How do we extract the remaining unsent recipients so that we can re-target the campaign?

Screenshots: listmonk_error listmonk_error2

subhash-ngowda avatar Jul 02 '24 12:07 subhash-ngowda

Any solution to this problem? For my client email is not being sent to anyone.

mitexleo avatar Aug 29 '24 18:08 mitexleo

Any solution to this problem? For my client email is not being sent to anyone.

@MaximilianKohler is right

I have this issue but without any blocked users.

Cyrix126 avatar Sep 05 '24 16:09 Cyrix126

As a note, if the campaign is "finished" and you update the cell "status" of the table "campaigns" to "paused", you can continue the campaign but it will stop again, sometimes after sending some, sometimes immediately.

Cyrix126 avatar Sep 06 '24 07:09 Cyrix126

Any updates on how to solve this? My campaign was marked as finished after sending 596 / 1500 emails (SES rate limiting). I want to re-send the emails.

rkcreation avatar Sep 09 '24 09:09 rkcreation

As a workaround, drastically increasing the batch size worked for me. I have only very small lists (<100 subscribers), so I do not know the consequences for larger lists.

For other updates regarding this issue, you can refer to #1931

ohaeusler avatar Sep 09 '24 17:09 ohaeusler

This is being actively tracked and investigated here: https://github.com/knadh/listmonk/issues/1931 - I'll close this thread so that we can consolidate the discussions in one place.

As a workaround, drastically increasing the batch size worked for me. I have only very small lists (<100 subscribers), so I do not know the consequences for larger lists.

This seems to be a clue. I still have not been able to reproduce this (please check the thread on #1931)

knadh avatar Sep 10 '24 05:09 knadh

This seems to be a clue. I still have not been able to reproduce this

You couldn't reproduce the blocklist issue I described? https://github.com/knadh/listmonk/issues/1762#issuecomment-2028016662

However, I think the batch size defaults to 500, and since @ohaeusler is sending to <100 at a time, it couldn't be the same blocklist issue.

MaximilianKohler avatar Sep 10 '24 06:09 MaximilianKohler

Hi @MaximilianKohler. I couldn't. Please see https://github.com/knadh/listmonk/issues/1931#issuecomment-2333743067

The subscribers are always ordered in the ascending order of their ID when batching. The condition to pull the batch is > last_subscriber_id (which is 0 to begin with and is updated with the last ID in each batch when it's done) and then < max_subscriber_id, which is the ID of the last subscriber in all the batches to be processed, so the ID of 150k-th subscriber. The batch query cannot return 0 items because it's essentially doing: Fetch all subscribers between ID 0 and ID 150k and not unsubscribed, and then from the results, slice and get $batch number.
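
A minimal model of the batching described above, to show why filtering before slicing cannot produce an empty batch while eligible subscribers remain. This is a sketch and not the actual SQL: the function names are hypothetical, the upper bound is treated as inclusive (an assumption, so that the last subscriber is not skipped), and "blocklisted" stands in for whatever status the real query excludes.

```python
def next_batch(subscribers, last_id, max_id, batch_size):
    """Filter first (ID range and status), then slice to the batch size."""
    eligible = [
        s for s in sorted(subscribers, key=lambda s: s["id"])
        if last_id < s["id"] <= max_id and s["status"] != "blocklisted"
    ]
    return eligible[:batch_size]


def run_filter_then_slice(subscribers, batch_size):
    """Process batches until no eligible subscriber is left."""
    max_id = max((s["id"] for s in subscribers), default=0)
    sent, last_id = [], 0
    while True:
        batch = next_batch(subscribers, last_id, max_id, batch_size)
        if not batch:
            return sent
        sent.extend(s["id"] for s in batch)
        # Advance the cursor to the last ID actually processed.
        last_id = batch[-1]["id"]


# IDs 4 to 6 are blocklisted; with batch_size=3 they would fill one page.
SAMPLE_SUBSCRIBERS = [
    {"id": i, "status": "blocklisted" if 4 <= i <= 6 else "enabled"}
    for i in range(1, 10)
]
print(run_filter_then_slice(SAMPLE_SUBSCRIBERS, batch_size=3))  # [1, 2, 3, 7, 8, 9]
```

Here a run of blocklisted IDs is simply skipped, because the slice is taken from the already filtered rows. If campaigns still finish early, the cause would have to be somewhere other than this query shape.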

knadh avatar Sep 10 '24 06:09 knadh