DID Resolution Test Suite

Open wip-abramson opened this issue 1 year ago • 4 comments

Attempting to capture and continue the discussion from TPAC

We need to develop a test suite that demonstrates multiple DID Resolver implementations satisfy the MUST statements in the specification at a minimum.

We should develop something that allows DID Resolver implementations to submit either an HTTPS endpoint or a Docker container exposing a resolution interface, along with a collection of test cases (DIDs) whose resolution exercises the MUST statements in the spec. So we would need some way to match DIDs to the statement that they exercise and the test that checks this.

I believe this test suite wouldn't care which DID Methods are being resolved? I.e. Someone might submit a did:btcr resolver and a bunch of did:btcr identifiers and another person might submit a did:tdw resolver with did:tdw identifiers. Both of these together might demonstrate multiple conformant implementations of the DID Resolution specification.
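As a rough illustration of the submission model described above, the sketch below pairs each resolver endpoint with the DIDs it brings and the MUST statements those DIDs exercise, then lists the statements covered by at least two implementations. Every name here (`DidCase`, `Submission`, `implementations_per_statement`, the `stmt-*` identifiers and the `.example` URLs) is hypothetical and not taken from any existing test suite.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DidCase:
    did: str                  # DID supplied by the implementer
    statement_ids: list[str]  # MUST statements this DID exercises (hypothetical IDs)


@dataclass
class Submission:
    implementer: str
    endpoint: str             # HTTPS resolution endpoint (placeholder URL)
    cases: list[DidCase] = field(default_factory=list)


def implementations_per_statement(submissions: list[Submission]) -> dict[str, set[str]]:
    """Map each statement ID to the set of implementers that exercise it."""
    coverage: dict[str, set[str]] = defaultdict(set)
    for sub in submissions:
        for case in sub.cases:
            for sid in case.statement_ids:
                coverage[sid].add(sub.implementer)
    return dict(coverage)


# Two submissions using different DID methods, as in the example above.
submissions = [
    Submission("resolver-a", "https://resolver-a.example/resolve",
               [DidCase("did:btcr:example-1", ["stmt-1", "stmt-2"])]),
    Submission("resolver-b", "https://resolver-b.example/resolve",
               [DidCase("did:tdw:example-2", ["stmt-1"])]),
]

coverage = implementations_per_statement(submissions)
# Statements exercised by at least two independent implementations.
well_covered = sorted(s for s, impls in coverage.items() if len(impls) >= 2)
print(well_covered)
```

Keeping the statement IDs on the test case rather than on the resolver means the suite itself never needs to know which DID method is involved.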

Although noting that as @dmitrizagidulin pointed out, interoperability between DID methods is more complex and multi-faceted but I think this is out of scope in this iteration?

There is this existing test suite which we may be able to adapt - https://github.com/w3c-ccg/did-resolution-test-suite There is also a did:key test suite - https://github.com/w3c-ccg/did-key-test-suite

cc @BigBlueHat for any additional insight

Finally, as @burnburn mentioned, it would be great to find someone willing to champion this work.

wip-abramson avatar Oct 04 '24 15:10 wip-abramson

I agree with this summary, and I agree it shouldn't matter which DID method is used when running a test suite for the DID Resolution spec.

peacekeeper avatar Oct 04 '24 22:10 peacekeeper

interoperability between DID methods is more complex and multi-faceted but I think this is out of scope in this iteration?

Not sure what you mean by "this iteration". It's my expectation that interoperability between DID methods is a major point of this WG.

jandrieu avatar Oct 07 '24 14:10 jandrieu

Not sure what you mean by "this iteration".

I mean this working group I guess.

It's my expectation that interoperability between DID methods is a major point of this WG.

I agree it is a major point of this WG. But as Dmitri highlighted there are many aspects to be considered when assessing if two DIDs are interoperable.

I believe the aspect of interoperability this WG intends to demonstrate is that two DIDs from different DID methods can be resolved following the DID Resolution specification to produce a conformant resolution result, including, when applicable, a DID Core conformant DID document. If a DID method has a conformant DID resolver, then we are stating it is interoperable with other DID methods that also have conformant resolvers.

However, just because I can resolve did:abc:1234 and did:xyx:987 does not guarantee that my software will be able to make use of these two DIDs. For example, one of the DID documents may contain a verification method that my software does not understand. Or a service endpoint etc. All of this is out of scope, and may never be in scope for this WG, as Dmitri pointed out these details are primarily worked out at the policy level of individual applications and ecosystems.
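To make the in-scope part concrete, here is a minimal sketch of a method-agnostic structural check on a resolution result. It assumes the result is serialized as a JSON object with the three members named in DID Core (`didResolutionMetadata`, `didDocument`, `didDocumentMetadata`); the function name and the wording of the problem messages are made up for illustration. It deliberately does not look inside verification methods or services, which is the out-of-scope part.

```python
REQUIRED_MEMBERS = ("didResolutionMetadata", "didDocument", "didDocumentMetadata")


def structural_problems(result: dict) -> list[str]:
    """Return structural problems; an empty list means the shape looks right."""
    problems = [f"missing member: {m}" for m in REQUIRED_MEMBERS if m not in result]
    # Both metadata members are expected to be maps (JSON objects).
    for m in ("didResolutionMetadata", "didDocumentMetadata"):
        if m in result and not isinstance(result[m], dict):
            problems.append(f"{m} is not a map")
    # When a DID document is returned, it needs at least an "id".
    doc = result.get("didDocument")
    if doc is not None and not (isinstance(doc, dict) and "id" in doc):
        problems.append("didDocument has no id")
    return problems


ok = {
    "didResolutionMetadata": {},
    "didDocument": {"id": "did:abc:1234"},
    "didDocumentMetadata": {},
}
broken = {"didDocument": {"id": "did:xyx:987"}}

print(structural_problems(ok))
print(structural_problems(broken))
```

The same check applies unchanged to results from any DID method, which is what makes it usable in a method-agnostic suite.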

wip-abramson avatar Oct 07 '24 14:10 wip-abramson

This was discussed during the #did meeting on 07 February 2025.

w3cbot avatar Feb 07 '25 02:02 w3cbot

This was discussed during the #did meeting on 12 June 2025.

View the transcript

w3c/did-resolution#92

<Wip> w3c-ccg/did-resolution-test-suite

Wip: Test suite for the DID resolution - good start but needs significant changes. 87 MUST statements need tests.

<ottomorac> Seems to be using the cypress testing library

Wip: Resolver URL and set of tests in the /fixtures folder. Does the test suite expect HTTPS url bound to the results. Every resolver needs test cases that may use different DIDs. How are we handling errors? No place to submit resolver implementations

Wip: if there are 87 MUST statements do all need tests?

dmitri: Ideally yes.

Benjamin Young: W3C testing requirements are testing the spec not necessarily the code though you are doing both. Need to assure the test suite is aligned with the spec to make it implementable. Can have requirements on other specs, downstream specs, etc. MUST doesn't mean it's a requirement of the code. Are the tests automatable

Wip: for these resolvers this DID support the implementation and resolve per the spec

Wip: next step to move test suite forward? Should we continue to work on the CCG test suite or leave them to work on it?

ottomorac: most of this work was done for folks working under Markus. If you can have untestable MUST statements, is that something that needs to be addressed or left for testing in non-code ways?

bigbluehat: just kept a list of those that they couldn't implement. Helpful in development. Ultimately had to take them out or comment them out as it concerned people if they appeared in the report. Provided a separate list of those that were unimplementable because they weren't for software.

Benjamin Young: this didn't cause any raised eyebrows. These are implementation not test but spec tests

swcurran: should be a list of tests that a DID method should implement. List what you want to test (the inventory), then the DID method would construct DIDS and some bad DIDs to test different conditions. DID resolution test resolves the DID in layered approach. Creating the inventory is key.

Wip: Resolver implementations need to prove they resolve the DIDs in all the cases they are for.

bigbluehat: creating a test suite API sounds like what we're talking about. Just a minimal API. If you were doing something different from VC-API you could wrap it in VC-API then the test suite is always communicating with the same 2 or 3 endpoints. Someone needs to look at the shim to make sure that's not introducing issues. Test suite authors will be few and stressed to do their work

<Wip> w3c/did-resolution#93

Note Benjamin Young is bigbluehat in the above comments

swcurran: you have a test that's a list of DIDs and run them through the resolver. Any DID resolver needs a way to pass in a DID to it.

<ottomorac> +1 to universal resolver as swcurran suggests

Wip: want DIDs and resolutions options as a package as required by the API

swcurran: it's the DID url not just the DID

<Zakim> JoeAndrieu, you wanted to suggest it's not the resolver but the method that needs it

JoeAndrieu: every resolver doesn't need http interface. A resolver might be a library, does that library need to open an http socket?

JoeAndrieu: for the sake of testing we're using http

Wip: would like some help moving this forward - perhaps from the CCG

bigbluehat: the test suites are working group products. They can begin in the CCG but final test suites need to be working group produced. Can work on both at the same time. Bitstring Status List didn't get into the WG until the last minute but that's an exception. The WG needs to say these are our tests.

swcurran: sounds like a test suite is in scope for this group. But isn't it just a list of tests the DID method should implement?

Wip: Disagree, as we're testing resolvers. This group needs to define the tests then hit the resolver to see they work.

Wip: verifying the test is they are resolved.

Swcurran: need to define a set of tests

bigbluehat: yes, define the tests and see who can implement them. You're testing the implementation not the Spec. One did method testing the resolution spec but there is value in each of the methods showing up with the test suite showing the resolution spec is implemented against the MUST statements.

<Zakim> JoeAndrieu, you wanted to say this is interoperability

Wip: resolvers abstract over the different DID methods.

JoeAndrieu: point of the tests is test the resolver not the methods. If every resolver has a http endpoint. What's the common interface that makes sense so we have a standard interface

Wip: next step - look at the MUST statements and figure out what the tests are for them.

Wip: need a place for doing that work

bigbluehat: recommend putting MUST statements in an md file and go through what it would take to test them.

ottomorac: going back to swcurran, leverage what the universal resolver is doing to accomplish these tests. Enshrine universal resolver as a way of testing this.

Wip: universal resolver is one way to do this but there should be others.

Wip: agrees with bigbluehat's point that we reach out to others on this.

swcurran: universal resolver could be a good tool for the resolution spec itself. But there is one implementation per DID method. DID methods may need 4 implementations of their DID method and need four ways to resolve. The universal resolver is only one.

bigbluehat: agrees with what swcurran's suggesting but for the W3C's process

bigbluehat: the only must is between the resolver and the implementation spec.

<swcurran> For context to what Benjamin is saying -- in the did:webvh group, we are doing all the same things -- inventorying MUSTs, defining tests, creating DIDs for resolution.

<ottomorac> noted

bigbluehat: If they're not requirements by W3C we should leave them for the CCG and others. Narrowing for getting WG effort done. If it's not a W3C spec it's not formally needed.

Wip: could define a common data structure as one output.


w3cbot avatar Jun 12 '25 16:06 w3cbot

Following on from our discussion last week I reached out to Patrick St Louis who pointed me in the direction of the script to extract normative statements from the spec. This can be found here:

https://github.com/digitalbazaar/test-suite-coverage-metrics/blob/122a836631dceb8a665fbda0bfa9445adb9de31b/app/plugins/analysers.py#L82

Running this against the DID Resolution spec url - https://www.w3.org/TR/did-resolution

Produces the following list: https://gist.github.com/PatStLouis/82725c4cdbf25b64e747c5f5a7785cbc

I took these statements and dumped them into a markdown file which is available on hackmd as suggested here - https://hackmd.io/pxOH-leKR8KD8LOxdb7w5A

Looking at the statements, we have some work to do to add additional context. E.g. 1. If present, the associated value MUST be an ASCII string.
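For readers who want a feel for what such an extraction does, here is a deliberately naive sketch that keeps only the sentences containing MUST or MUST NOT. It is not the Digital Bazaar script linked above, which works on the spec's HTML; the function name and the sample text are assumptions, and only the second sample sentence is quoted from this thread.

```python
import re

# Matches MUST and MUST NOT as whole, upper-case words (RFC 2119 style).
MUST_RE = re.compile(r"\bMUST(?: NOT)?\b")


def extract_must_statements(text: str) -> list[str]:
    """Split plain text into sentences and keep those containing MUST."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if MUST_RE.search(s)]


# Sample input: sentence 2 is quoted from the thread, the rest is invented.
sample = (
    "This section is informative. "
    "If present, the associated value MUST be an ASCII string. "
    "Implementers should read the whole document. "
    "The input DID MUST NOT be modified."
)
statements = extract_must_statements(sample)
for s in statements:
    print(s)
```

The output shows the context problem directly: "the associated value" is extracted without the property it refers to, so each statement still needs a pointer back to its section.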

I will start working through that this week.

Then we need to see which if any of these statements are currently tested in the existing test suite and for those not tested decide if and how we are going to test them.

I am thinking we co-opt a special topic call to move this work forward as a group for those interested.

wip-abramson avatar Jun 16 '25 12:06 wip-abramson

This was discussed during the #did meeting on 19 June 2025.

View the transcript

DID Resolution Test Suite Special Topic Call

<ottomorac> w3c/did-resolution#92

Wip: I have extracted the normative statements, approximately 80. Probably need a special topic call to deep dive into how to test the statements.

Wip: So that those interested can engage, we can review the statements and refine them as well as determining how to test them.

Wip: Candidate date is July 9th.


w3cbot avatar Jun 19 '25 16:06 w3cbot

This was discussed during the #did meeting on 17 July 2025.

View the transcript

DID Test Suite Special Topic - 23rd July

<Wip> w3c/did-resolution#92

Wip: We will hold this special topic call focused on test suite, I have run the Digital Bazaar script to extract all the MUST statements from the spec...

Wip: We will discuss what a test might look like.. if you are interested in this work please join the wg call next week


w3cbot avatar Jul 17 '25 15:07 w3cbot

This was discussed during the #did meeting on 24 July 2025.

View the transcript

Debrief DID Test Suite Special Topic - 23rd July w3c/did-resolution#92

ottomorac: first topic: test suite topic call

wip: It was a great discussion. Markus wasn't there, so we want some feedback.
… We want to pick 1-5 normative statements in the spec we feel are stable ...
… Manu?

manu: Good discussion. The things we talked about were like what would be the best way to create the test. 1. We have to extract all the normative statements.
… initial extract: 82 statements.
… Concern that is a lot of statements.
… Concern raised that these tests take a lot of effort
… and we want to avoid problems we had with previous approaches.
… specifically avoiding static tests
… we want to ensure that the test suite can be regularly run to make sure conformance is maintained
… There was a discussion about starting small and focused and get to an end-to-end scenario with at least two different implementations?
… If we can, then we can expand to more tests
… Also discussed how disruptive it is for the spec to change after tests are written.
… So we discussed how we as a group figure out how implementers should run the test
… No one signed up for leading the test suite. Will mentioned willingness to help, as did Digital Bazaar, but neither could take the lead.
… Then, towards the end, we talked about concrete things to move the test suite forward.
… 1. Pick an architecture. (Reuse VC 2.0 or something else)
… I can go into detail about what that means, but if we have consensus on that, the downstream decisions get easier.
… Next question is what are the first 5 tests and what organizations are going to provide endpoints?
… Another key decision to make: do we require that each resolver work against at least one did method? or do resolvers bring their own method

ottomorac: some of these tests might be automated. Was there a discussion about hiring a tester?

manu: I think it would be good for us to take some resolutions with the test suite

<Wip> +1 to resolutions

manu: Will had suggested some proposals. I have some rough ones written up.

ottomorac: let's go to that

manu: I have three rough proposals
… first is to make the test suite did method agnostic
… the alternative is we would need to require that all resolvers resolve a specific DID method for testing, such as a did:key
… So the resolver tells the test suite which methods it supports

markus_sabadello: I think it's a good idea based on our experience.

<Wip> *chair hat off* I strongly prefer the BYO DID Method approach

markus_sabadello: when you submit your implementation, you specify a number of DIDs and expected results and we run against that.

ivan: Nothing against it, but to make it automatic, we'll need a clear way to identify each method.

manu: it will literally be a DID that it supports

ivan: ok. that works.

<JoeAndrieu> +1

<ottomorac> PROPOSAL: Make the DID Resolution test suite "DID Method agnostic" (each endpoint specifies legitimate and invalid DIDs to use against the implementation)

<ivan> +1

<manu> +1

<Wip> +1

<JoeAndrieu> +1

<ottomorac> +1

<pchampin> +1

<markus_sabadello> +1

<KevinDean> +1

RESOLUTION: Make the DID Resolution test suite "DID Method agnostic" (each endpoint specifies legitimate and invalid DIDs to use against the implementation)

<TallTed> +1

ottomorac: ok. thanks for that.

manu: two more to consider
… The next one is imperfect, but the general architecture of VC 2.0 can be reused
… I think this is a solid improvement over the original DID test suite
… these are mocha driven tests that output JSON we format for humans
… hundreds of thousands of dollars of work that we can reuse
… Then we just need to write the literal tests
… The only thing I'm concerned about is that Danube Tech did build a test suite.

markus_sabadello: I would have suggested the same thing. All these test suites (VC-related) have been successful, so starting with that as a basis is great.
… And we can help with specific tests by copying what we have into the VC framework

ottomorac: Does this architecture require the use of Mocha?

manu: Yes. It does use Mocha. We don't have to use it, but it would be more work to replace it
… Let's reuse as much as possible

bigbluehat: what we're getting from Mocha is the reporter integration. We could use something else, but it would basically be starting over.
… The tests can be rewritten in another language, but the framework with mocha/chai/javascript would be non-trivial to replace

<ottomorac> PROPOSED: For the DID RESOLUTION test suite - adopt the test suite infrastructure that was used for VC v2.0; configuration based, HTTP API driven, runs at a regular interval, generates JSON output, formatted into a human-readable report.

<Wip> +1

<JoeAndrieu> +1

<bigbluehat> +1

<markus_sabadello> +1

<TallTed> +1

<ottomorac> +1

<JennieM> +1

<manu> +1

<pchampin> +1

<KevinDean> +1

<ivan> +1

<smccown> +1

RESOLUTION: For the DID Resolution test suite - adopt the test suite infrastructure that was used for VC v2.0; configuration based, HTTP API driven, runs at a regular interval, generates JSON output, formatted into a human-readable report.
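The last two properties in this resolution (JSON output, formatted into a human-readable report) can be pictured with the small sketch below. It is not the VC v2.0 infrastructure, which is Mocha-based; the JSON shape, the test titles, and the `render_report` helper are all assumptions chosen only to show the JSON-to-matrix step.

```python
import json

# Hypothetical JSON output of one test run; the real suite's format may differ.
raw = json.dumps([
    {"test": "returns a resolution result", "implementation": "resolver-a", "passed": True},
    {"test": "returns a resolution result", "implementation": "resolver-b", "passed": True},
    {"test": "rejects an invalid DID", "implementation": "resolver-a", "passed": True},
    {"test": "rejects an invalid DID", "implementation": "resolver-b", "passed": False},
])


def render_report(results_json: str) -> str:
    """Turn a flat list of results into a test-by-implementation text matrix."""
    results = json.loads(results_json)
    impls = sorted({r["implementation"] for r in results})
    tests = list(dict.fromkeys(r["test"] for r in results))  # keep first-seen order
    outcome = {(r["test"], r["implementation"]): r["passed"] for r in results}
    width = max(len(t) for t in tests)
    # Header row: blank label column followed by implementation names.
    lines = [" " * width + "  " + "  ".join(impls)]
    for t in tests:
        cells = []
        for i in impls:
            passed = outcome.get((t, i))
            mark = "n/a" if passed is None else ("pass" if passed else "FAIL")
            cells.append(mark.ljust(len(i)))
        lines.append(t.ljust(width) + "  " + "  ".join(cells))
    return "\n".join(lines)


report = render_report(raw)
print(report)
```

Because the report is derived purely from the JSON, it can be regenerated on every scheduled run without touching the tests themselves.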

manu: hopefully an easy one. We're going to start small.

<ottomorac> +1 , fully agree to start small

manu: this is modest guidance to the person writing the initial tests
… Then we will decide how to flesh out the rest of the tests as things stabilize

<ottomorac> PROPOSAL: For the DID Resolution test suite, focus on a small number (~5) of end-to-end tests with at least two implementations to start.

<manu> +1

<ivan> +1

<pchampin> +1

<JoeAndrieu> +1

<ottomorac> +1

<Wip> +1

<markus_sabadello> +1

<TallTed> +1

<smccown> +1

<bigbluehat> +1

<KevinDean> +1

<JennieM> +1

RESOLUTION: For the DID Resolution test suite, focus on a small number (~5) of end-to-end tests with at least two implementations to start.

<Zakim> JoeAndrieu, you wanted to ask DDOS question

JoeAndrieu: I wanted to ask, if I have a resolver that wants to work with these tests... I think we agreed that I need to have a resolver up and running to respond to these tests, how to handle denial of service?

manu: I think we had this concern before, we could use a docker based approach....

manu: yes. that was a concern with VC 2.0, but nobody had a problem with that. We do have a docker option.
… that could be a way to provide a non-DDOSable way to do it.

<bigbluehat> +1

manu: Also we have authorization turned on for some of the VC services.
… So that would also be a way to do it. Oauth or ZCAPs.
… Github secrets manage the access tokens

<Wip> My connection is still a little spotty. I just wanted to flag, that we will aim to schedule another special topic call soon to make progress on the test suite work. Especially focusing on selecting the 1-5 statements


w3cbot avatar Jul 24 '25 16:07 w3cbot

WG Discussion on 24-Jul:

Agreed to the following resolutions:

  1. Make the DID Resolution test suite "DID Method agnostic" (each endpoint specifies legitimate and invalid DIDs to use against the implementation)

  2. For the DID Resolution test suite - adopt the test suite infrastructure that was used for VC v2.0; configuration based, HTTP API driven, runs at a regular interval, generates JSON output, formatted into a human-readable report.

  3. For the DID Resolution test suite, focus on a small number (~5) of end-to-end tests with at least two implementations to start.

ottomorac avatar Jul 24 '25 17:07 ottomorac

Reflecting on yesterday's call I just want to flag that did:btc1 identifiers are never NOT_FOUND. See https://github.com/dcdpr/did-btc1/issues/124

So for our first fail state, we should pick something else. Probably INVALID_DID is a good one.
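One reason INVALID_DID is a convenient first fail state is that a syntactically invalid DID can be produced for any method without touching its registry. The sketch below transcribes the generic DID syntax from the DID Core ABNF into a regular expression, as I understand that grammar; the helper name is hypothetical. Passing this check says nothing about method-specific validity, and how a resolver reports the error is left out entirely.

```python
import re

# idchar per the DID Core ABNF: ALPHA / DIGIT / "." / "-" / "_" / pct-encoded
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"
# did = "did:" method-name ":" method-specific-id
# method-name is one or more lower-case letters or digits.
DID_RE = re.compile(rf"did:[a-z0-9]+:(?:{_IDCHAR}*:)*{_IDCHAR}+")


def is_syntactically_valid_did(value: str) -> bool:
    """True when the whole string matches the generic DID syntax."""
    return DID_RE.fullmatch(value) is not None


candidates = [
    "did:example:123",     # well-formed
    "did:example:a:b",     # well-formed, colon-separated segments
    "did:Example:123",     # upper-case letter in method name
    "did:example:",        # empty method-specific-id
    "example:123",         # missing "did:" scheme
    "did:example:abc%2",   # incomplete percent-encoding
]
for c in candidates:
    print(c, is_syntactically_valid_did(c))
```

An implementer could submit a few strings like the last four as the "invalid DIDs" for their endpoint.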

cc @BigBlueHat @msporny

wip-abramson avatar Aug 07 '25 14:08 wip-abramson

This was discussed during the #did meeting on 07 August 2025.

View the transcript

Debrief Special Topic Call for DID Resolution Test Suite on 6-Aug

<ottomorac> w3c/did-resolution#92

Wip: Yes the special topic call went well, Ben and Parth will be setting up the infrastructure, the plan is to keep meeting every 2 weeks on Wednesdays during the special topic call...

Wip: Starting with tests are going to check the correctness of did resolution... we expect the tests to be generic...

Wip: Also we decided that for these calls since they will be technical in nature and part of the objectives was to have knowledge transfer from DigitalBazaar team to the rest of the team....

Wip: Then we will use the infra from the CCG meaning the Google Meet and Transcription

manu: the only thing I need to do is to create a meeting in the CCG infra account for this
… Then include the chairs as people that can convene that meeting, and configure the minutes generation to trigger
… Then it will be setup automatically and will run automatically as well as sending the minutes out to the ccg

Wip: Sounds good. Please just provide me with the link and the details so that I can update our regular agenda....

Manu: Yes will do that...

manu: To send the minutes out to the DID Wg, it might be a bit more tricky to post to the mailing list
… it is possible, but a bit more work

manu: This is possible

pchampin: The infra is currently sending emails to the CG mailing list. Don't see why this would be different for the working group mailing list

manu: Yea but you can't post unless you are a member to catch spam etc

pchampin: I may need to allow the first email to go through. This is possible

<Zakim> Wip, you wanted to suggest we can just manually reference it in the meeting for the special topic call

manu: Anyone can also forward it. Not concerned about it


w3cbot avatar Aug 07 '25 16:08 w3cbot

This was discussed during the #did meeting on 28 August 2025.

View the transcript

w3c/did-resolution#92

Wip: This test suite is pointing to a place controlled by the VC test suite.
… I'm wondering whether we should create our own repo.

manu: we could do that, and I think that's what Benjamin was thinking.
… we have gotten to a state where we have 3 such repositories...
… it would be nice to get down to 1 implementers repo.
… people could contribute their implementations and opt in to different test suites, regardless of the WG working on them

<Wip> w3c-ccg/did-resolution-mocha-test-suite

<Wip> w3c-ccg/vc-test-suite-implementations

Wip: the did-resolution-mocha-test-suite repo, containing our test suite, is currently pointing to the vc-test-suite-implementations repo for the list of implementations

<Wip> https://github.com/w3c-ccg/vc-test-suite-implementations/blob/main/implementations/DanubeTech.json

Wip: it is strange that the latter says "VC" but maybe that's ok.

<manu> Here's the other one, to make things even more complicated: w3c/vc-test-suite-implementations :)

Wip: the JSON file above is how Danube Tech defines their implementation of the test suite

manu: I think we should get rid of the CCG repo
… and we should agree (amongst the CCG, the VC and DID WG) to put all our implementations in the same repo
… let's not fork it!

ivan: I presume that includes future WG as well...
… like the proposed DID Methods WG

manu: exactly
… that means that registered implementations must be able to indicate which test suite they expect to pass
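A shared implementations repository with per-suite opt-in could work along the lines of the sketch below. The manifest fields (`name`, `endpoint`, `tags`), the tag values, and the `opted_in` helper are placeholders for illustration and are not copied from the vc-test-suite-implementations repository.

```python
# Hypothetical manifests, one per registered implementation.
manifests = [
    {"name": "Implementer A", "endpoint": "https://a.example/resolve",
     "tags": ["did-resolution"]},
    {"name": "Implementer B", "endpoint": "https://b.example/issue",
     "tags": ["vc-2.0"]},
    {"name": "Implementer C", "endpoint": "https://c.example/resolve",
     "tags": ["did-resolution", "vc-2.0"]},
]


def opted_in(manifests: list[dict], suite: str) -> list[str]:
    """Names of implementations that declare they expect to pass `suite`."""
    return [m["name"] for m in manifests if suite in m.get("tags", [])]


print(opted_in(manifests, "did-resolution"))
```

Each test suite then selects only the entries that opted in, so a single repository can serve several working groups without forking.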


w3cbot avatar Aug 28 '25 15:08 w3cbot

This was discussed during the #did meeting on 18 September 2025.

View the transcript

w3c/did-resolution#92

Wip: sorry about yesterday, I didn't feel I had enough done to warrant taking time on a call.
… I have done some work on the test suite.
… I have a handful of tests, now waiting for DIDs to be passed in.

<Wip> https://github.com/w3c-ccg/vc-test-suite-implementations/blob/main/implementations/DIF.json

Wip: Are there any implementers of resolvers here (or that we know of) who could submit things?

manu: we intend to submit one, but we have a big backlog.
… We have the resolution software, only not yet the HTTP binding.

<Zakim> JoeAndrieu, you wanted to say we should try to get Shaun Conway's Ixo implementation, but I don't know if they have https

JoeAndrieu: I will try to get Shaun Conway's implementation.
… I don't think their implementation is up to speed with our HTTPS approach.

Wip: anyone else we should be reaching out to?

JoeAndrieu: We should ask DCD for a BTCR2 resolver.

manu: we don't formally need to have a test suite, the requirement is to have two independent implementations for each feature.

<ivan> +1 to manu

manu: We determine this. Of course a test suite is a good idea to do it.
… But that's not blocking us for going to CR.


w3cbot avatar Sep 18 '25 15:09 w3cbot