
`Unclosed ClusterNode object` error triggered by `valkey.asyncio.cluster.ClusterNode.__del__()`

Open jakob-keller opened this issue 8 months ago • 5 comments

valkey-py 6.1.0 does not appear to close asyncio cluster connections cleanly. As a result, I am seeing crashes caused by `Unclosed ClusterNode object` errors raised in the event loop.

I assume this is what triggers it:

  1. `valkey.asyncio.cluster.NodesManager.initialize()` calls `.set_nodes()` with `remove_old=True` if certain conditions are satisfied: https://github.com/valkey-io/valkey-py/blob/ca5c7c53e83d42251ce6c1a37e9cd72be58ba34a/valkey/asyncio/cluster.py#L1406

  2. `.set_nodes()` creates asyncio tasks to disconnect any 'old' cluster nodes, but never awaits those tasks (the `noqa` pragma might be telling): https://github.com/valkey-io/valkey-py/blob/ca5c7c53e83d42251ce6c1a37e9cd72be58ba34a/valkey/asyncio/cluster.py#L1199-L1215

  3. The finalizer of `valkey.asyncio.cluster.ClusterNode` checks for any connections that have not been disconnected yet. Since the disconnect tasks from step 2 are not awaited, a node can be finalized while its connections are still open, which triggers the error: https://github.com/valkey-io/valkey-py/blob/ca5c7c53e83d42251ce6c1a37e9cd72be58ba34a/valkey/asyncio/cluster.py#L1022-L1036
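To illustrate the pattern in step 2 without depending on valkey-py, here is a minimal sketch using plain asyncio. `FakeNode` and `set_nodes_fire_and_forget` are hypothetical stand-ins, not the library's actual code; they only show that a task created but not awaited has not run by the time the synchronous function returns.

```python
import asyncio


class FakeNode:
    """Hypothetical stand-in for a cluster node; not the valkey-py class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connected = True

    async def disconnect(self) -> None:
        # Yield to the event loop once, as real network I/O would.
        await asyncio.sleep(0)
        self.connected = False


def set_nodes_fire_and_forget(old: dict, new: dict) -> None:
    # Assumed simplification of the pattern described in step 2:
    # disconnect tasks are scheduled, but nothing ever awaits them.
    for name in list(old):
        if name not in new:
            asyncio.create_task(old.pop(name).disconnect())


async def demo() -> bool:
    node = FakeNode("a:6379")
    old = {node.name: node}
    set_nodes_fire_and_forget(old, {})
    # The task has been scheduled but has not had a chance to run yet.
    return node.connected


still_connected = asyncio.run(demo())
print(still_connected)
```

In this sketch the node is already removed from the mapping while it still reports itself as connected, which is the window in which a finalizer that checks for open connections could fire.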

As a potential solution, `set_nodes()` could be made into a coroutine function that awaits the disconnect tasks before returning.
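A rough sketch of that idea, again with a hypothetical `FakeNode` stand-in rather than the real `ClusterNode`, and a simplified `set_nodes` that is only an assumption about what the change might look like. Every caller would have to `await` it.

```python
import asyncio


class FakeNode:
    """Hypothetical stand-in for a cluster node; not the valkey-py class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connected = True

    async def disconnect(self) -> None:
        # Yield to the event loop once, as real network I/O would.
        await asyncio.sleep(0)
        self.connected = False


async def set_nodes(old: dict, new: dict, remove_old: bool = False) -> None:
    # Simplified sketch: remove stale nodes and wait until all of them
    # have finished disconnecting before returning to the caller.
    if remove_old:
        stale = [old.pop(name) for name in list(old) if name not in new]
        await asyncio.gather(*(node.disconnect() for node in stale))
    old.update(new)


async def demo():
    stale, kept = FakeNode("a:6379"), FakeNode("b:6379")
    old = {stale.name: stale, kept.name: kept}
    await set_nodes(old, {kept.name: kept}, remove_old=True)
    return stale.connected, kept.connected, sorted(old)


result = asyncio.run(demo())
print(result)
```

With the disconnects awaited, a removed node is guaranteed to be disconnected by the time `set_nodes` returns, so a finalizer that checks for open connections would find none.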

Client: valkey-py 6.1.0, CPython 3.13.2, containerized Linux on arm64

Database: Amazon ElastiCache for Valkey 8.0.1, t4g based instances, cluster mode enabled

jakob-keller, Mar 25 '25 13:03