LuaLock failing to release, part 2

Open coandco opened this issue 9 months ago • 0 comments

After dealing with the new LockReleaseError in #222 (which accounted for most of the problems I was having), I discovered that a small percentage of the time, exiting an async with LuaLock block still never returns.

Expected Behaviour

import asyncio
from coredis import RedisCluster
from coredis.recipes.locks import LuaLock
from coredis.exceptions import LockError

class TestLock(LuaLock):
    async def __aexit__(self, exc_type, exc, tb):
        print("Before aexit")
        await super().__aexit__(exc_type, exc, tb)
        print("After aexit")

async def main():
    rclient = RedisCluster(startup_nodes=[{"host": "example", "port": 6372}])
    try:
        async with TestLock(rclient, "examplename", blocking_timeout=0.1, timeout=10):
            print("entered lock block")
            await asyncio.sleep(1)
        print("exited lock block")
    except Exception as e:
        print(f"Hit exception {e!r} when using lock")
        return
    print("after lock")

if __name__ == "__main__":
    asyncio.run(main())

If you saw "entered lock block", you should always see both "Before aexit" and "After aexit". If there was an error, you should hit the "Hit exception" print instead.

Current Behaviour

On the same decently large distributed system as in #222, I'm still very occasionally (about once a day) hitting a situation where I see "Before aexit" but never "After aexit", and no "Hit exception" either.

Steps to Reproduce

This is a production bug that I'm not sure how to reproduce reliably, other than having a lot of contention for locks. It does seem like it only happens when the lock is held for a very short period of time.
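One way to hunt for this might be a small stress harness that runs many short-held lock acquisitions and counts the exits that don't return in time. The sketch below is only an illustration: worker, stress, and make_lock are hypothetical names, and a shared asyncio.Lock stands in for LuaLock so it runs without a Redis cluster. To test the real thing, make_lock would have to return a LuaLock built on a live client.

```python
import asyncio


async def worker(make_lock, hold: float, exit_timeout: float) -> bool:
    """Acquire, hold briefly, then release; return False if the release hangs."""
    lock = make_lock()
    await lock.__aenter__()
    await asyncio.sleep(hold)
    try:
        # Watchdog around the release only, so a hang here is attributable to __aexit__.
        await asyncio.wait_for(lock.__aexit__(None, None, None), timeout=exit_timeout)
    except asyncio.TimeoutError:
        return False
    return True


async def stress(make_lock, tasks: int = 50, hold: float = 0.0, exit_timeout: float = 1.0) -> int:
    """Run many contending workers and return how many releases timed out."""
    results = await asyncio.gather(
        *(worker(make_lock, hold, exit_timeout) for _ in range(tasks))
    )
    return results.count(False)


async def main() -> int:
    # Assumption: an in-process asyncio.Lock stands in for LuaLock here.
    # It is created inside the running loop so it binds to the right loop on Python 3.9.
    shared = asyncio.Lock()
    return await stress(lambda: shared)


hung = asyncio.run(main())
print(f"hung releases: {hung}")
```

With the stand-in lock, no release should ever hang. The harness only becomes informative when pointed at the real lock under real contention.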

Workaround

For now, I'm considering the following as a workaround:

import asyncio
from coredis.recipes.locks import LuaLock

class Lock(LuaLock):
    def __init__(self, *args, release_timeout: float = 5, **kwargs):
        self.release_timeout = release_timeout
        super().__init__(*args, **kwargs)

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Bound the release: on timeout, wait_for cancels it and raises asyncio.TimeoutError.
        await asyncio.wait_for(
            super().__aexit__(exc_type, exc_val, exc_tb),
            timeout=self.release_timeout,
        )

That way I can at least ensure that exiting the lock will always return within a set timeout value (and allow the lock to expire on the Redis side on its own), rather than sometimes just going out to lunch and never coming back.
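To illustrate what the caller would see with this workaround, here is a minimal sketch that runs without Redis. HangingLock is a hypothetical stand-in whose release never returns (it is not coredis code), and BoundedLock applies the same asyncio.wait_for wrapping to it.

```python
import asyncio


class HangingLock:
    """Hypothetical stand-in for a lock whose release never returns."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Simulate the hang: wait on an event that is never set.
        await asyncio.Event().wait()


class BoundedLock(HangingLock):
    def __init__(self, release_timeout: float = 5):
        self.release_timeout = release_timeout

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Same idea as the workaround: give up on the release after release_timeout seconds.
        await asyncio.wait_for(
            super().__aexit__(exc_type, exc_val, exc_tb),
            timeout=self.release_timeout,
        )


async def demo() -> str:
    try:
        async with BoundedLock(release_timeout=0.05):
            pass
    except asyncio.TimeoutError:
        # The timeout surfaces to whoever wrote the async with block.
        return "release timed out"
    return "released"


print(asyncio.run(demo()))
```

The point to note is that the timeout is not swallowed: asyncio.TimeoutError propagates out of the async with block, so a surrounding except Exception handler like the one in the example above would catch it.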

Your Environment

  • coredis version: 4.17.0
  • Redis version: 6.0.16
  • Operating system: Debian 11
  • Python version: 3.9.2

coandco • May 15 '24 18:05