cachecontrol icon indicating copy to clipboard operation
cachecontrol copied to clipboard

Bound how long to wait for lock files in FileCache.

Open josephw opened this issue 10 years ago • 5 comments

I was bitten by a stale lockfile in the web cache directory after stopping a prior run: by default, lockfile waits indefinitely with no diagnostics. This change adds a default timeout to cause visible failure rather than a deadlock.

As the file lock in FileCache is only held while writing a new version of the response, long waits are very likely to be due to a stale lock. Pick a reasonable default, but also allow lock_timeout to be specified on construction.

josephw avatar Sep 05 '15 12:09 josephw

@josephw Sorry for a slow reply. This looks reasonable, but we need a test ensuring we're passing the timeout correctly. Basically, mocking the lock_class and ensuring the arg from the constructor gets pass through.

Thanks for the PR!

ionrock avatar Nov 25 '15 23:11 ionrock

I've added a couple of tests, to make sure that a default of 30s is passed through and that a specified value is also passed through.

josephw avatar Nov 30 '15 05:11 josephw

@ionrock ping. Is this good to merge with the added tests? Any other changes that would help?

josephw avatar Jun 13 '16 14:06 josephw

some extra here, as I test FileCache under windows, the default lock_class LockFile, which use LinkLockFile internally is kind of broken under windows, as it stale while acquiring the lock,

although dir lock do works, but after further reading the lockfile package is deprecated, so we actually needs some replacement.

AndCycle avatar Sep 28 '17 09:09 AndCycle

Hi! Whatever it takes to get something like this in would be great. Happy to participate if a maintainer is interested in accepting. I just encountered this in poetry where a stale lock file got left behind and my installs just hung. It took me forever to find that the bug was an infinite timeout on an never-releasing lockfile.

zanieb avatar Sep 14 '22 22:09 zanieb

the lockfile <-> filelock switch, which is now done, was motivated by the poetry use case mentioned above. So far as poetry is concerned there is no need to put a timeout on waiting for a lock.

An MR that changes the behaviour so that a timeout is enabled by default is a breaking change: any users who are relying on the lock being an actual lock, which might be held for longer than that default timeout, would find that their code was broken.

So probably I think this MR is unnecessary - though of course there are users other than poetry - and definitely I think that changing the default behaviour is undesirable.

dimbleby avatar May 31 '23 14:05 dimbleby

@dimbleby thanks for the reply!

the lockfile <-> filelock switch, which is now done, was motivated by the poetry use case mentioned above. So far as poetry is concerned there is no need to put a timeout on waiting for a lock.

Can you link a ref for this? I don't quite follow and I guess I'm just missing some context.

An MR that changes the behaviour so that a timeout is enabled by default is a breaking change:

Definitely agree — a breaking change does not seem okay. It may be worth highlighting the current behavior and allowing a timeout to be passed still — but I'm not sure how much that applies to real use-cases.

zanieb avatar May 31 '23 14:05 zanieb

https://github.com/psf/cachecontrol/pull/284. the yanking of which is compensated for in poetry by https://github.com/python-poetry/poetry/pull/6471

dimbleby avatar May 31 '23 15:05 dimbleby

At least on Linux, filelock being backed by fcntl.flock means it doesn't suffer from the original problem here, because a lock won't survive its process exiting. Using a better lock is a better solution than working around it with a timeout.

josephw avatar Jun 01 '23 02:06 josephw