lock on NFS mount not working

Open alcoat opened this issue 1 year ago • 5 comments

Hello, I am opening this issue because it seems to me that portalocker does not work well with NFS mounts. The problem was discovered while using the cachier package: https://github.com/python-cachier/cachier/issues/128

alcoat avatar Dec 05 '23 17:12 alcoat

Hello there,

I'm the author of the cachier package, and I'm just chiming in to add that this is the second time file locking on NFS mounts has been an issue for us as a downstream package. This is the first: https://github.com/python-cachier/cachier/issues/86

Thank you for the great package, and we hope to see this get fixed!

Shay

shaypal5 avatar Dec 05 '23 19:12 shaypal5

There have been other reports regarding NFS in the past: #66

Locking on NFS is rather problematic. While it's not entirely impossible, in my experience it absolutely kills your performance. If at all possible, I would highly recommend using Redis locks instead: https://github.com/wolph/portalocker#redis-locks

That said, this specific issue seems to be an uncaught exception that I need to fix :)

wolph avatar Dec 06 '23 00:12 wolph

Can this be reopened? (Should issues really be closed when they are stale? I'm questioning this because the report is perfectly valid IMHO, and at the very least the missing exception handling can still be implemented.)

carnil avatar Jan 13 '24 05:01 carnil

Sorry about that, it seems I have misconfigured the stalebot.

wolph avatar Jan 13 '24 11:01 wolph

> Sorry about that, it seems I have misconfigured the stalebot.

Thank you for re-opening!

carnil avatar Jan 13 '24 12:01 carnil

@wolph Is there a solution being worked on? Or is the label just meant to avoid the stalebot actions?

rpmcginty avatar Jun 19 '24 00:06 rpmcginty

@rpmcginty I believe it's already fixed on dev, but I haven't created a new release yet. I'm having trouble reproducing the problem, so if you could test it, that would be fantastic!

wolph avatar Jun 19 '24 01:06 wolph

Certainly! I installed via the VCS URL but was met with the same issue.

Here is the script I used to test. The working directory is on an AWS EFS file system (NFS).

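import portalocker

# Take the lock on a file opened for writing, then write to it.
with portalocker.Lock('test.lock', 'wb') as fh:
    print('wb Locked')
    fh.write(b'Locked\n')
print('wb Unlocked')

# Take the lock again on the same file, this time opened read-only.
with portalocker.Lock('test.lock', 'r') as fh:
    print('r Locked')
    print(fh.read())
print('r Unlocked')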

I received the following error:

wb Locked
--
wb Unlocked
Traceback (most recent call last):
  File "/init/test.py", line 12, in <module>
    with portalocker.Lock('test.lock', 'rb') as fh:
  File "/home/user/merlin_env/lib/python3.9/site-packages/portalocker/utils.py", line 163, in __enter__
    return self.acquire()
  File "/home/user/merlin_env/lib/python3.9/site-packages/portalocker/utils.py", line 290, in acquire
    raise exceptions.LockException(exception)
portalocker.exceptions.LockException: [Errno 9] Bad file descriptor

It appears to fail only on the read-only locks. I should note that it happens for both 'r' and 'rb'.

@wolph do we need to modify the usage at all?

rpmcginty avatar Jun 19 '24 18:06 rpmcginty

It looks like the current "fix" is at least effective in that it throws the correct exception now.

Beyond that, I'm honestly not sure what else I can do. It appears that the filesystem doesn't support read-only locks in this case.
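One possible explanation, offered as an assumption rather than something confirmed in this thread: the Linux flock(2) man page notes that NFS clients emulate flock() with fcntl() byte-range locks, and POSIX requires a file to be open for writing before an exclusive byte-range lock can be placed on it. The sketch below uses only the standard library and a local temporary file (no NFS, no portalocker) to show that an exclusive lockf on a read-only handle is rejected with EBADF, the same errno 9 as in the traceback, while a shared lock or a handle opened with 'r+' is accepted. The names try_lock and demo are made up for this illustration.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path, mode, flags):
    """Open path with the given mode, attempt a non-blocking lockf,
    and return 'ok' or the symbolic errno name of the failure."""
    with open(path, mode) as fh:
        try:
            fcntl.lockf(fh, flags | fcntl.LOCK_NB)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        fcntl.lockf(fh, fcntl.LOCK_UN)
        return "ok"


def demo():
    """Try three open-mode / lock-type combinations on a temp file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return {
            "exclusive_r": try_lock(path, "r", fcntl.LOCK_EX),
            "shared_r": try_lock(path, "r", fcntl.LOCK_SH),
            "exclusive_r_plus": try_lock(path, "r+", fcntl.LOCK_EX),
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    for case, outcome in demo().items():
        print(f"{case}: {outcome}")
```

If this is indeed what happens on EFS, two things may be worth trying for readers, both untested on NFS here: opening the file with 'r+' instead of 'r', or requesting a shared lock by passing flags=portalocker.LockFlags.SHARED | portalocker.LockFlags.NON_BLOCKING to portalocker.Lock.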

For cases like these I created RedisLock, which works across multiple networked systems and is not limited by filesystems at all.

There's also a small chance that lockf, as opposed to flock, works in this case. The locker can be changed by overriding the LOCKER attribute: https://github.com/wolph/portalocker/blob/a0c5c75262477e6f7167802ab26f5a489189c6de/portalocker/portalocker.py#L95

It should be noted that different systems behave differently with lockf (Linux vs. BSD, for example).
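A minimal sketch of what overriding the LOCKER attribute could look like. It assumes the POSIX backend module exposes a module-level LOCKER callable, as in the file linked above; other releases may be organized differently, so check the installed version first. The helper name swapped_locker is hypothetical and not part of portalocker. It swaps the attribute only for the duration of a with block, so the rest of the program keeps the default flock behavior.

```python
import contextlib
import fcntl


@contextlib.contextmanager
def swapped_locker(backend, locker=fcntl.lockf):
    """Temporarily replace backend.LOCKER with another locking callable.

    `backend` is assumed to be the module that holds the LOCKER
    attribute (for the linked commit: portalocker.portalocker).
    """
    previous = backend.LOCKER
    backend.LOCKER = locker
    try:
        yield backend
    finally:
        # Restore the original locker even if the body raised.
        backend.LOCKER = previous


# Hypothetical usage, assuming the module layout of the linked commit:
#
#     import portalocker
#     import portalocker.portalocker as backend
#
#     with swapped_locker(backend):
#         with portalocker.Lock('test.lock', 'r+') as fh:
#             print(fh.read())
```

Whether lockf actually behaves better than flock on a given NFS mount has to be verified on that mount; this only shows the mechanics of the switch.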

wolph avatar Jun 19 '24 22:06 wolph

I've published a new release with the fix that gives the correct exceptions. I'm not sure if there's a better fix available here... the underlying filesystem simply doesn't support these locks so there's little else we can do.

A nice and safe option could be to use Redis locks, those always work: https://github.com/wolph/portalocker/?tab=readme-ov-file#redis-locks

wolph avatar Jun 22 '24 22:06 wolph