offlineimap confused after (suspend and) resume

I have offlineimap running in a tmux session at (almost) all times. After resuming my computer, offlineimap has trouble reconnecting properly. Often it just hangs for a long time (or indefinitely) while nothing seems to happen.

I came up with a solution to the problem by having a resume-script that sends SIGUSR2 to my running offlineimap, whereafter it is automatically restarted.
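For illustration, a resume hook of this kind could look roughly like the sketch below. It is not the script actually used here: it assumes the offlineimap PID is already known (how to find it is left out), and the names `pid_alive` and `stop_gracefully` are made up for the example. It sends SIGUSR2 and then polls for a bounded time, so the caller can tell a clean exit from a hang.

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_gracefully(pid, grace_seconds=30.0, poll_interval=0.5):
    """Send SIGUSR2 to pid and wait up to grace_seconds for it to exit.

    Returns True if the process went away in time, False if it is still
    there. Note: an exited child of the calling process keeps its PID
    until it has been waited for, so this is meant for unrelated processes.
    """
    os.kill(pid, signal.SIGUSR2)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return True
        time.sleep(poll_interval)
    return not pid_alive(pid)
```

A `False` result means the process did not react within the grace period. Whether to escalate to a harder signal at that point is the open question in this thread.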

The problem, though, is that offlineimap sometimes takes a long time to exit upon SIGUSR2. Maybe this is again related to lost TCP connections, timeouts, and such, and so essentially the same problem as above.

Any suggestions on how I should solve this? I have been reluctant to just send SIGKILL to the process, because I was afraid that this might cause inconsistencies in the repo or similar. But maybe this is what I need to do?

Sep 30 '13 07:09 quite

Which version of OfflineIMAP are you using? Which OS?

Sep 30 '13 08:09 konvpalto

offlineimap 6.5.4 on an updated Arch Linux, which means basically latest vanilla versions of everything

Sep 30 '13 11:09 quite

What typically happens after resume and upon sending offlineimap the SIGUSR2:

Terminating after this sync...
[seems to hang forever, I press ^C]
Terminating NOW (this may take a few seconds)...
[hung again...]

Oct 03 '13 11:10 quite

Just wanted to add that I see the same thing using 6.5.5 on an updated Arch Linux. Are there any thoughts as to why this is happening?

Feb 06 '14 14:02 aignas

Same here. Offlineimap 6.5.3 on OpenSUSE. I use timeout to work around that and kill offlineimap if timeout returns non-zero. I have never had issues with a broken repo because of that (happily).
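The same idea can be sketched in Python, as an illustration rather than the exact setup described here. The helper name `run_with_timeout` is hypothetical. One difference worth noting: `timeout(1)` from coreutils sends SIGTERM by default, whereas `subprocess.run` kills the child outright when its `timeout` expires.

```python
import subprocess


def run_with_timeout(cmd, limit_seconds):
    """Run cmd and kill it if it runs longer than limit_seconds.

    Returns the command's exit status, or None if it had to be killed.
    """
    try:
        completed = subprocess.run(cmd, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return None
    return completed.returncode
```

From a cron job this might be called as `run_with_timeout(["offlineimap", "-o"], 600)`, where the 600-second limit is an arbitrary example value.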

Feb 10 '14 20:02 Gonzih

How do you use timeout(1) for that? Are you running offlineimap -o (run-once mode)?

Feb 11 '14 11:02 quite

No, just calling offlineimap from cron.

Reply to this email directly or view it on GitHub: https://github.com/OfflineIMAP/offlineimap/issues/56#issuecomment-34747606

Best regards, Max

Feb 11 '14 12:02 Gonzih

I'm seeing this issue too (6.5.5 on Arch Linux).

Mar 12 '14 14:03 doy

Same here, 6.5.5 on Arch Linux. Not using cron, just running offlineimap -o.

Mar 16 '14 19:03 rcorre

I am seeing a similar problem on a Macbook Air running Mac OS X 10.9.2 (Mavericks). No cron, just using offlineimap.

Apr 01 '14 20:04 treese

Same here. Offlineimap 6.5.4, Python: 2.7.5, Debian Wheezy (Ubuntu)

Apr 22 '14 19:04 christopherraa

You should try setting socktimeout in the [general] section of your offlineimaprc. It sets a timeout on the select call, so the process will terminate when no data is received within the timeout. That solved the problem for me.
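To show why a socket timeout helps with connections that died during suspend, here is a small generic sketch using Python's `socket` module. It is not offlineimap's code, and `recv_or_give_up` is a made-up name. Without a timeout, `recv` on a dead connection can block indefinitely. With one, the call returns control after a bounded wait.

```python
import socket


def recv_or_give_up(sock, timeout_seconds):
    """Wait for data on sock, but stop blocking after timeout_seconds."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(4096)
    except socket.timeout:
        # Nothing arrived in time: let the caller treat the peer as gone.
        return None
```

The timeout only covers calls made on sockets that have it set, which may be relevant to the reports that hangs still occur occasionally.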

Jun 08 '14 12:06 mlen

Thanks @mlen -- setting socktimeout in the [general] section seems to work.

Jun 08 '14 14:06 rcorre

This workaround helps, but I'm still getting occasional hangs even with the socktimeout option set.

Jun 10 '14 01:06 doy

@mlen's suggestion worked for me, thanks.

Jul 21 '14 04:07 jbmartin

What value of socktimeout did you use?

Sep 09 '14 09:09 choucavalier

I use socktimeout = 10

Sep 09 '14 10:09 rcorre

@murphyslaw480 thanks :+1:

Sep 09 '14 10:09 choucavalier

This needs to be documented in the known issues.

Jan 12 '15 13:01 nicolas33

Done in cd962d4.

Feb 13 '15 16:02 nicolas33

As I mentioned, setting socktimeout doesn't actually fix the problem - it makes it less frequent, but it still happens to me all the time even with this set. I don't think this is a sufficient fix.

Feb 13 '15 16:02 doy

Ok. I know current behaviour sucks. Sadly, it's hard to handle this properly so don't expect this to be fixed soon.

Feb 13 '15 16:02 nicolas33

There are two things I would suggest here:

If the first C-c attempts a graceful exit, the second C-c should hard exit.
I'm pretty sure the remaining hanging is from the timeout not being applied to all blocking calls. This is something an audit should be able to catch.
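The first suggestion could be sketched as follows. This is an illustrative pattern, not the handler offlineimap uses. The class name `InterruptEscalator` and the injectable `hard_exit` parameter are assumptions made for the example.

```python
import os
import signal


class InterruptEscalator:
    """First SIGINT requests a graceful stop; a second one exits at once."""

    def __init__(self, hard_exit=os._exit):
        self.stop_requested = False
        self._hard_exit = hard_exit  # injectable so the logic can be tested

    def handle(self, signum, frame):
        if self.stop_requested:
            # Second interrupt: skip cleanup that might block on a dead socket.
            self._hard_exit(130)  # 128 + SIGINT, the conventional exit status
        self.stop_requested = True

    def install(self):
        """Register this object as the process's SIGINT handler."""
        signal.signal(signal.SIGINT, self.handle)
```

The main loop would check `stop_requested` between syncs. Python runs signal handlers in the main thread, so this only helps if the main thread is not itself stuck in a call that cannot be interrupted.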

Mar 31 '15 18:03 ezyang

Hi Edward,

Yes, I've already suggested to handle the second Ctrl-c as hard exit.
On resume, the timeout might require waiting until the local time is adjusted, or waiting until the timeout is hit. BTW, the broken socket should be handled better. I'm planning a deep refactoring, which should make such issues easier to fix. You might be interested in following the coming changes.

Mar 31 '15 21:03 nicolas33

Still happening on 6.6.1

Feb 16 '16 22:02 dolohow

The fix to force OfflineIMAP to stop on consecutive Ctrl+C presses was merged some days ago. It will be in the next release (6.7.0-rc2). AFAIK, nobody has worked on proper resume at wakeup.

Feb 17 '16 01:02 nicolas33

Thank you for the update. If someone could tell me what should be done, maybe I would try to implement that feature.

Feb 17 '16 07:02 dolohow

I'm also interested in someone fixing this. I have the same issue. :sweat_smile:

(I have offlineimap running as a systemd user service under Arch Linux)

Feb 17 '16 07:02 choucavalier

A naive but still effective approach would be to add print statements to find the blocker.
A more advanced way would be to try strace, although it can be tricky to map the output to lines of code.
Python includes debugging tools that could be useful. The most appealing ones for this purpose might be worth a try.
Teamwork can help greatly. Do share your analysis, successes and failures.

Bear in mind there might be more than one blocker.
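One such Python tool is the `faulthandler` module, which can dump the traceback of every thread when a signal arrives. The sketch below assumes a Python 3.3 or later interpreter and a POSIX system. The function name `enable_stack_dumps` and the choice of SIGQUIT are arbitrary, so pick a signal the program does not already use.

```python
import faulthandler
import signal


def enable_stack_dumps(log_file, signum=signal.SIGQUIT):
    """On signum, write the Python traceback of every thread to log_file.

    log_file must be an open file object with a real file descriptor and
    has to stay open for as long as the handler is registered.
    """
    # all_threads=True matters: the blocked call may not be in the main thread.
    faulthandler.register(signum, file=log_file, all_threads=True)
```

With this enabled at startup, sending the chosen signal to a hung process records where each thread is waiting, which is easier to map back to source lines than strace output.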

Feb 17 '16 13:02 nicolas33

Assuming this will be difficult to debug, can we at least implement a SIG{INT,TERM} handler which deletes the lockfile? I currently have to delete .offlineimap/*.lock files every time I resume from suspend because when offlineimap freezes it leaves the lockfiles hanging around.
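A handler along those lines might look like the sketch below. The function names and the lockfile paths passed in are hypothetical, and nothing here reflects how offlineimap manages its locks internally.

```python
import os
import signal
import sys


def remove_lockfiles(paths):
    """Delete each lockfile, ignoring ones that are already gone."""
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass


def install_cleanup(paths, signums=(signal.SIGINT, signal.SIGTERM)):
    """Remove the given lockfiles and exit when one of signums arrives."""

    def handler(signum, frame):
        remove_lockfiles(paths)
        sys.exit(128 + signum)  # conventional "killed by signal" status

    for signum in signums:
        signal.signal(signum, handler)
```

This cannot help when the process is ended with SIGKILL, which cannot be caught, so stale lockfiles would still need handling at the next start in that case.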

Jul 12 '16 01:07 pwnage101
