
Add a distributed lock capability to ACA-Py

Open swcurran opened this issue 5 months ago • 1 comments

ACA-Py is used in highly scaled deployments, including multiple instances running across multiple nodes of a cluster. So far we have not seen a need for a fully distributed lock to handle a transactional update to data across the entire cluster, but it is quite possible that we do or will need such a capability. Issue #3738 is an example: an issuer needs to apply a unique number from a sequence to a credential across the entire deployment. A solution for that specific issue was found (a database transaction that had been used previously had been inadvertently dropped in a refactoring), but it raised the question of whether a general purpose distributed lock should be added to ACA-Py.

If this is to be considered, please look at the comments in #3738 about approaches for such an implementation. As part of the work done for that issue, PR #3782 was created towards that goal. Work stopped when the problem with the refactoring was discovered. That PR could be continued if it is decided that a distributed lock would be helpful.

swcurran avatar Jul 24 '25 16:07 swcurran

PS: We wrote this lil async lock function that uses Redis / Valkey: https://github.com/didx-xyz/acapy/blob/1.3.1-20250702/acapy_agent/utils/async_lock.py

Originally wrote that to solve the cred-rev concurrency issue, but it's no longer needed for that.

We decided to keep this util class because it's helpful for us in the cheqd plugin -- where we need a concurrency lock so that only one resource is written to the ledger at a time (parallel requests will fail). Originally used a file lock for that, but replaced it with this Redis lock, so that shared storage isn't required for multiple agents.
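For readers who have not seen the pattern, here is a minimal sketch of a Redis / Valkey lock of this general kind. It is not the code from the linked `async_lock.py`; the function name `redis_lock` and all of its parameters are made up for illustration. It assumes a client that behaves like `redis.asyncio.Redis`, using `SET` with `NX` and `PX` to acquire and a small Lua script to release only a lock we still own.

```python
import asyncio
import uuid
from contextlib import asynccontextmanager

# Lua script: delete the key only if it still holds our token, so a holder
# whose lock already expired cannot release a lock now owned by someone else.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


@asynccontextmanager
async def redis_lock(client, name, ttl_ms=30_000, wait_s=10.0, poll_s=0.05):
    """Hold the lock `name` for the duration of the `async with` block.

    `client` is assumed to behave like redis.asyncio.Redis (set / eval).
    All names and defaults here are illustrative, not ACA-Py settings.
    """
    token = uuid.uuid4().hex  # unique per acquisition
    loop = asyncio.get_running_loop()
    deadline = loop.time() + wait_s
    # SET name token NX PX ttl_ms succeeds only if the key does not exist;
    # the TTL frees the lock if the holder crashes before releasing it.
    while not await client.set(name, token, nx=True, px=ttl_ms):
        if loop.time() >= deadline:
            raise TimeoutError(f"could not acquire lock {name!r}")
        await asyncio.sleep(poll_s)
    try:
        yield token
    finally:
        await client.eval(RELEASE_SCRIPT, 1, name, token)
```

Usage would look roughly like `async with redis_lock(client, "ledger-write"): ...`, where `client` comes from something like `redis.asyncio.Redis.from_url(...)` and the key name is up to the caller. Note the sketch does not extend the TTL, so the protected work has to finish within `ttl_ms`.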

I think it's worth contributing the above code, since there are probably a few cases where distributed locking is helpful.

Perhaps a similar class can be written to do the same thing with a file lock, and then the choice can be made configurable. The file lock would be the default, since it works for single agents and adds no dependency, with notes on how to configure Redis instead for those that want it.
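To make the file lock idea concrete, here is a rough sketch of what such a default could look like. It assumes a POSIX system (it uses `fcntl.flock`, which is not available on Windows), and the name `file_lock` and its parameters are hypothetical, not existing ACA-Py APIs. It polls with a non-blocking `flock` so the event loop is never blocked while waiting.

```python
import asyncio
import fcntl
import os
from contextlib import asynccontextmanager


@asynccontextmanager
async def file_lock(path, wait_s=10.0, poll_s=0.05):
    """Hold an exclusive lock on `path` for the `async with` block.

    Illustrative only: POSIX-specific, and it only coordinates agents
    that can all see the same lock file.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + wait_s
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        while True:
            try:
                # LOCK_NB makes flock raise instead of blocking the loop.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if loop.time() >= deadline:
                    raise TimeoutError(f"could not lock {path!r}")
                await asyncio.sleep(poll_s)
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```

A config setting (name to be decided) could then pick between this and the Redis-backed lock. The trade-off is the one mentioned above: the file lock needs no extra service, but multiple agents on different nodes would need shared storage for it to work.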

ff137 avatar Jul 24 '25 19:07 ff137