[Feature] Return lease IDs from `IDistributedLock` after obtaining a lock
Is your feature request related to a problem? Please describe.
While working on a DynamoDb implementation of IDistributedLock, I came across an edge case which is currently impossible to avoid.
Imagine the following scenario, for a sweeper running on a single node:
- Run A begins and successfully obtains Lease A on the sweeper resource.
- While Run A is still in progress, Lease A expires
- Run B begins and successfully obtains Lease B
- Run A completes and releases Lease B
This can be guarded against across multiple nodes by keeping a local dictionary of lease IDs, as @preardon has done in the Azure Blob implementation, since if Run B begins on a second node, when Run A completes it wouldn't have Lease B's ID in its dictionary. If both runs occur on a single node, however, the distributed lock has no way of knowing whether the lease it's clearing is the one it originally obtained.
Describe the solution you'd like
Change the signature of ObtainLock and ObtainLockAsync to return a lease ID as a string instead of a boolean result, for example:
Task<string> ObtainLockAsync(string resource, CancellationToken cancellationToken);
If the result is null, then the lock was not obtained. Otherwise it returns the ID of the lease.
Also change the signature of the ReleaseLock and ReleaseLockAsync methods to take a lease ID as a parameter, for example:
Task ReleaseLockAsync(string resource, string leaseId, CancellationToken cancellationToken);
When IDistributedLock is used, store the ID of the obtained lease and pass it to the release method.
Describe alternatives you've considered Two alternative signatures were considered:
- Return the lease ID as an
outparameter. This is possible for the synchronous method, but asoutparameters are not supported inasyncmethods, there would need to be a different approach for theasyncsignature and that could get confusing. - Return a tuple of a success/failure boolean and a string lease ID, instead of just the lease ID. Although success and failure is arguably clearer when using an explicit boolean as part of a tuple, unlike languages like Go tuple returns are not all that common in .NET.
@preardon Another one for you.
@dhickie This is an interesting one, what I've done for Storage blob is store the lease key inside of a dictionary in the provider, so its imposible for Performer A to release Lock B as you need the LeaseId
is this something That you can do for Dynamo?
The issue is that if you have a dictionary of resource IDs to lease IDs, then when Run B begins and successfully obtains a lease, then by the time Run A has finished, the ID of Lease A has been replaced in the dictionary by the ID of Lease B. When Run A goes to release the lock, it can't tell that the ID it obtained isn't the one that's in the dictionary.
@dhickie ah, are you talking about in process locking not locking across seperate process locking?
@dhickie I'm happy to move to this, I'll get a PR in.
@dhickie Please let me know what you think of the pull request above, I know the In Memory does not make use of the lock, but I dont believe it will expire so we won't have the issue you're talking about
@preardon Looks good to me 👍
@dhickie @iancooper One this that I'de like to Float the idea of quickly is instead of returning a string returning a tuple of bool and lockId. I'm just not the biggest fan of returning a null and taking that to mean it failed. Thoughts?
Yeah I thought about the same thing. I agree that implicit meaning from null values isn't the best (though historically fairly common). I don't tend to see tuples getting returned from functions that much, as they don't seem to have really taken off in the same way as other languages. Not that that should stop us, necessarily. I'd originally suggested null as the more common approach but I'd have no real issues with returning a tuple instead if we'd like to avoid that ambiguity.
I don't have a problem with that approach. We use option types, null object pattern etc. in a number of places. Mostly the codebase tends to use for optionality although I expect we are erratic around 'not found' with it sometimes being null and sometimes us returning a null object. We tend to keep exceptions for "crash now".
In this case I think that "go" style (err, value) is fine.
I'll add this to my ADR list, I have a few to catch up on to backfill.
this is now resolved