mailcow-dockerized

inactive sync job after wrong user/passwd

Open smjaberl opened this issue 2 years ago • 4 comments

Summary

If a sync job hits an error x times because of a wrong user or password, the job is flagged as inactive and will not run again until it is manually reactivated. I don't know the value of x; it is hidden.

It would be nice to have a config field for this x, either per sync job or globally.

Alternatively, the implementation could be changed so that the job is not set to inactive. Maybe the job could be skipped for 30 min after such an error.
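
To illustrate the "skip for 30 min" idea, here is a minimal sketch in Python. It is not how mailcow schedules sync jobs; `SyncJobState`, `should_run`, `record_result` and the `COOLDOWN` setting are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical setting: the "30 min" pause suggested above.
COOLDOWN = timedelta(minutes=30)


@dataclass
class SyncJobState:
    """Hypothetical per-job bookkeeping; not mailcow's actual schema."""
    last_auth_failure: Optional[datetime] = None


def should_run(state: SyncJobState, now: datetime) -> bool:
    """Return True unless the job failed authentication within the cooldown."""
    if state.last_auth_failure is None:
        return True
    return now - state.last_auth_failure >= COOLDOWN


def record_result(state: SyncJobState, auth_failed: bool, now: datetime) -> None:
    """Remember the time of a failure, or clear it after a successful run."""
    state.last_auth_failure = now if auth_failed else None
```

With this approach the job stays active the whole time. A failure only delays the next attempt, so a temporarily unavailable server would not require a manual reactivation.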

Motivation

Right now I have to keep an eye on the sync jobs myself. If users do not get mail, they contact me and I have to reactivate the job.

If x were configurable, I could set it higher and the jobs wouldn't stop running.

Additional context

No response

smjaberl avatar Aug 12 '22 14:08 smjaberl

Perhaps @feldsam can check this. :)

andryyy avatar Aug 13 '22 15:08 andryyy

Hello, it deactivates because there were memory leaks, see #4276.

feldsam avatar Aug 15 '22 06:08 feldsam

Running into the same issue with sync jobs from flaky servers. Users are bugging me, since they do not understand the UI mechanic of "turning it off and on again".

Adorfer avatar Aug 24 '22 08:08 Adorfer

I have the same problem. Maybe a notification of some sort (mail, push) would be good to inform the admin or user that there was a problem?

BeyondVertical avatar Sep 03 '22 08:09 BeyondVertical

I still have the same problems, and I am running an up-to-date system. Should this be a bug report instead of a feature request?

smjaberl avatar Oct 01 '22 11:10 smjaberl

cc @DerLinkman / @FreddleSpl0it: FYI, @smjaberl pinged me directly on Telegram regarding this issue. Not sure if you both can help?

patschi avatar Oct 03 '22 09:10 patschi

Hello, as I wrote, a memory leak was reported, so we deactivate the job after a failure to prevent the leak from happening. Thinking about it, should we allow disabling this feature via a config variable? @andryyy?

feldsam avatar Oct 03 '22 09:10 feldsam

Hello, I was the one who reported the memory leak issue #4276. Now, after some months of running several mailcows with this fix, I have the same problems. Sync jobs on various mailcows get deactivated for unknown reasons. The sync job states that authentication failed, but the configuration on both sides didn't change. When I reactivate the sync jobs, everything works fine instantly. There seems to be no valid reason why the sync job was deactivated in the first place.

Maybe the way #4276 was fixed wasn't the best solution. The original problem was a memory leak in the dovecot container. That's where the root of all evil lies. Everything else seems to be just a workaround and not a proper solution.

Choppel avatar Oct 11 '22 07:10 Choppel

Hello, the problem seems to be worse than expected. We have quite a few mailcow instances running on different premises. All of them act only as "passive" mail servers that do not receive emails directly but sync them from different official mail servers. Some of those use a mail server hosted by us (not a mailcow; called relayserver from here on). Last weekend we did an update and reboot of this relayserver. Since then some customers have been complaining that they don't receive emails. A review of all sync jobs in all mailcow instances showed that on every mailcow that uses our relayserver at least one sync job was deactivated. All of these inactive sync jobs were deactivated at the exact same time: when I restarted our relayserver.

So the algorithm seems to deactivate syncjobs immediately when the relay server is either starting, shutting down or otherwise inaccessible.

Choppel avatar Oct 12 '22 07:10 Choppel

Hi, thank you @Choppel for your comment. It's the same on my side. If the relay server is unavailable even once, the sync job is deactivated immediately. And it is perfectly normal for a server to be unavailable now and then, because of maintenance or just a connection reset on a local router.

So please fix it!

smjaberl avatar Oct 25 '22 12:10 smjaberl

Hello @feldsam is there an ETA on fixing the issue? Either by fixing the deactivation algorithm or fixing the memory leak inside the dovecot container.

Choppel avatar Nov 07 '22 07:11 Choppel

Hi, I am just a contributor and I am busy these months, so no ETA. You can contribute a fix yourself, and I can review it. You can also sponsor this feature/fix.

feldsam avatar Nov 07 '22 19:11 feldsam

@Choppel This is unfortunately a very annoying issue and requires manual work to re-enable sync. Can you at least tell us a way to do the activation without the GUI? Is the status in the database or in some conf file that could be adjusted automatically? Thanks

t-lie avatar Dec 07 '22 18:12 t-lie

https://demo.mailcow.email/api/#/Sync%20jobs/Update%20sync%20job
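
For anyone who wants to script the re-activation, here is a minimal sketch of how the linked "Update sync job" endpoint might be called from Python. The path `/api/v1/edit/syncjob`, the `X-API-Key` header and the `items`/`attr` payload shape are assumptions taken from my reading of the API docs, so please check them against the `/api` page of your own instance. The host name, key and job IDs are placeholders.

```python
import json
import urllib.request


def build_reactivation_request(base_url, api_key, job_ids):
    """Build a POST request that sets the given sync jobs to active.

    Assumption: the endpoint accepts {"items": [...], "attr": {"active": "1"}}.
    """
    payload = {"items": [str(i) for i in job_ids], "attr": {"active": "1"}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/edit/syncjob",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def reactivate(base_url, api_key, job_ids):
    """Send the request and return the decoded JSON response."""
    request = build_reactivation_request(base_url, api_key, job_ids)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Something like `reactivate("https://mail.example.org", "YOUR-API-KEY", [3, 7])` could then be run from cron, using a read/write API key.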

feldsam avatar Dec 07 '22 19:12 feldsam

https://demo.mailcow.email/api/#/Sync%20jobs/Update%20sync%20job

The API is cool and all, but could we please get a notification, when a sync job fails and is automatically deactivated. Also: Please add an option to deactivate API - I can not deactivate it now.

BeyondVertical avatar Dec 08 '22 17:12 BeyondVertical

@BlackScreen Please open new issues for the things you stated as they are new feature requests. IMO this is a discussion about the necessity of the new feature introduced in #4276 and not its usability.

Choppel avatar Dec 11 '22 09:12 Choppel

Yeah, I know. But the root problem is that the jobs deactivate themselves without any notice to me as an admin. So, this is - at least in my opinion - definitely related, because the new "feature" introduces new problems. Don't get me wrong, I love Mailcow and I am very thankful for your work, but this is really annoying.

BeyondVertical avatar Dec 11 '22 20:12 BeyondVertical

Hello guys, since the memory leak was reported, there have been new releases of the dovecot image, so it may already be solved. Anyway, I never hit that memory leak bug myself. So I think adding a config variable to enable auto deactivation could be the solution. It can be disabled by default and we will see. If somebody hits the memory leak bug, then they can enable auto deactivation until the bug is resolved.

feldsam avatar Dec 11 '22 21:12 feldsam

Good news everyone.

@feldsam Your hint was correct. The bug in the dovecot container seems to be fixed. I am reproducing the problem in #4276 right now and for over an hour there was no increase in docker pids.

The fastest and most elegant way to get everything working again IMO would be to revert the changes introduced in #4540. Judging from the comments in this issue and other issues I don't get the feeling that "automatic syncjob deactivation" is a feature that anyone would actually want/need. It just introduces more problems especially when the syncjob is wrongfully deactivated. This is not a problem in the imapsync_runner script but in imapsync itself. It returns EXIT_AUTHENTICATION_FAILURE_USER1 when the server is not reachable. You will most likely never write a script that can circumvent this fault.
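
To make the false positive concrete: a wrapper could probe the remote host before trusting an authentication failure, roughly as in the sketch below. This is not what imapsync_runner does; `tcp_reachable` and `should_deactivate` are hypothetical helpers, and how the wrapper learns that imapsync reported an authentication failure is left out on purpose.

```python
import socket
from typing import Callable


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_deactivate(
    reported_auth_failure: bool,
    host: str,
    port: int,
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> bool:
    """Only treat an auth failure as real if the server answers at all."""
    if not reported_auth_failure:
        return False
    # An unreachable server says nothing about the credentials.
    return probe(host, port)
```

Even then, a server that already accepts TCP connections while it is still starting up could slip through, which is why this only reduces the false positives and does not remove them.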

Choppel avatar Dec 12 '22 09:12 Choppel

This sounds really good @Choppel, thanks for your work - and I agree with all of your conclusions. The best option would be to get rid of the workaround that deactivates the sync jobs. Looking forward to this getting merged into the release versions soon. Maybe a Christmas present? ;)

BeyondVertical avatar Dec 12 '22 10:12 BeyondVertical

Ok guys, @DerLinkman will publish a 2022-11a today.

feldsam avatar Dec 12 '22 10:12 feldsam

Now live in 2022-11a

DerLinkman avatar Dec 12 '22 15:12 DerLinkman

Thanks @feldsam @DerLinkman It now works as expected. Keep up the good work.

Choppel avatar Dec 12 '22 15:12 Choppel

Thank you all, will update my Mailcow tonight. Awesome! :)

BeyondVertical avatar Dec 12 '22 15:12 BeyondVertical