aries-cloudagent-python icon indicating copy to clipboard operation
aries-cloudagent-python copied to clipboard

feat: did-rotate

Open amanji opened this issue 1 year ago • 2 comments

Taken from #2494 (c/o @dbluhm)

This PR is an early implementation of the RFC 0794 DID Rotate protocol. This was an exercise to feel out any potential issues with the protocol as defined.

This is not yet a complete implementation. Notably missing right now is a way to actually trigger a DID rotation through an Admin API call. I also need to finish wiring up the handlers to the manager. And who could forget about tests. 🙂

Overall, the protocol is solid. I'll report my thoughts on the protocol on the PR to the Aries RFCs.

An ACA-Py specific concern I discovered in this process that probably deserves some discussion: invalidating the connection target cache when the DID Rotation is committed. As currently structured, we never really expect the connection targets for a connection to change after the protocol establishing it concludes. DID Rotation shakes that expectation up.

Right now, in my changes, I've added a clear_connection_targets_cache helper method to partially address this. It's partial because it will only work for ACA-Py when run as a single instance with the default in memory cache or if you're using Indicio's redis cache plugin for whole cluster shared caching.

I think it would be better for ACA-Py to use a value including both DIDs of the connection in the cache key as a more complete solution. The problem with this and why I've not done it on a first pass is that there would need to be some restructuring of how we use the Responder classes so that the send methods expect a whole ConnRecord and not just a connection_id (or individually a their_did and my_did, I suppose) so we can extract the DIDs from it. 9 times out of 10, I would say that connection record is already available to the caller of the send method and my hypothesis is that in the remaining 1 time out of 10, it will not dramatically impact performance of critical operations (it is not common for us to know the connection_id and not also have a connection record -- maybe only a few places in the Admin API might be like this).

I'll do some more investigation on these points. Might be good to talk implications at the next ACA-PUG call.

cc @swcurran @TelegramSam FYI @Jsyro this has ties back to updating the DIDs associated with connections that we've talked about recently.

amanji avatar Feb 27 '24 21:02 amanji

Tagging @dbluhm for review, please.

amanji avatar Mar 04 '24 18:03 amanji

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Mar 14 '24 00:03 sonarqubecloud[bot]

Not sure why this PR is getting so many random integration test fails. Not the same as the ones as I fixed recently. Might try an look at these other ones at some point. Re-running again.

jamshale avatar Mar 14 '24 16:03 jamshale

@jamshale Yeah the tests seem so finicky.

amanji avatar Mar 14 '24 16:03 amanji

@jamshale Yeah the tests seem so finicky.

They seem to get worse sometimes and are better sometimes. The last fail was the one I thought I had fixed. Hadn't seen it in a while. Not sure if adding more retries might help or not.

jamshale avatar Mar 15 '24 17:03 jamshale