How to handle hostnames with ports?
Now that we support paths (as of PR #5) in our did:web URLs, and we're using the : to encode the / characters for paths, this poses another challenge.
Since we're using the : character for paths, what do we do about the actual most common intended purpose of that character, which is to specify a port number?
So, specifically, say I am a developer who has just fired up a test server on my local machine, which is running on https://localhost:8443 (the https is deliberate of course - I made a self-signed cert for it and everything). This kind of thing happens all the time (it's happening to me right now :) ).
If there's a did:web document residing on that domain (say in /.well-known/did.json), what will that URL look like? According to our rules so far:
did:web:localhost:8443
Except now we're using : (in the did-specific-identifier portion of the url) to encode path fragments. So that URL would "decode" to https://localhost/8443. Not what we want.
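To make the ambiguity concrete, here is a minimal sketch of the path-aware rule with no special handling for ports. The function name `naiveDidWebToUrl` and the `/did.json` suffix for path-based DIDs are assumptions for illustration, not spec text.

```typescript
// Hypothetical helper: treats every ":" in the method-specific
// identifier as a path separator, with no notion of a port.
function naiveDidWebToUrl(did: string): string {
  const prefix = "did:web:";
  if (did.slice(0, prefix.length) !== prefix) {
    throw new Error("not a did:web identifier: " + did);
  }
  const segments = did.slice(prefix.length).split(":");
  const host = segments[0];
  const path = segments.slice(1);
  // Assumption for this sketch: a bare domain resolves under /.well-known,
  // anything with path segments resolves directly under that path.
  const location = path.length === 0 ? "/.well-known" : "/" + path.join("/");
  return "https://" + host + location + "/did.json";
}

// The port is silently reinterpreted as a path segment.
console.log(naiveDidWebToUrl("did:web:localhost:8443"));
// https://localhost/8443/did.json
```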
So, what are people's thoughts on how to best handle this? @awoie, @OR13 ?
retry logic I suppose.
The other option we have is - we can require the hostname portion to be URL-encoded.
So, https://localhost:8443 would encode as did:web:localhost%3A8443.
^ much better idea.
Do we have a timeline for a resolution of this issue? As far as I can see this prevents actual adoption in a real life scenario.
@felixwatts you could just use a URL instead of a DID, and return a did document with the same URL everywhere the DID would be.
@felixwatts - apologies, I did the thing where I implemented the proposed solution and thought I updated the spec but didn't. Will be making a PR shortly.
@dmitrizagidulin is your PR arriving any time soon? Could you explain the proposed solution?
I guess the proposed solution is base64url-encoding the host portion as per https://github.com/w3c-ccg/did-method-web/issues/7#issuecomment-623485232 and what @felixwatts did.
We are going to adopt the base64url-encoding proposal in aries-framework-go.
@llorllale sure. I'll make the PR today.
The proposed solution is - URL-encoding (as in, encodeURIComponent) both the host portion, and each path portion.
So, the test vectors would be:
did:web:localhost%3A8080 -> https://localhost:8080/.well-known/did.json
https://example.com/path/some+subpath -> did:web:example.com:path:some%2Bsubpath
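A rough sketch of how those two test vectors could be produced and consumed, assuming the identifier is split on `:` before each segment is percent-decoded. The helper names `didWebToUrl` and `toDidWeb` are hypothetical, and appending `/did.json` to path-based DIDs is an assumption of this sketch.

```typescript
const DID_WEB_PREFIX = "did:web:";

// Hypothetical helper: did:web identifier -> URL of the DID document.
function didWebToUrl(did: string): string {
  if (did.slice(0, DID_WEB_PREFIX.length) !== DID_WEB_PREFIX) {
    throw new Error("not a did:web identifier: " + did);
  }
  // Split on ":" first, then percent-decode each segment, so that an
  // encoded "%3A" in the host comes back as a port separator.
  const segments = did
    .slice(DID_WEB_PREFIX.length)
    .split(":")
    .map((s) => decodeURIComponent(s));
  const host = segments[0];
  const path = segments.slice(1);
  // Assumption: a bare host resolves under /.well-known.
  const location = path.length === 0 ? "/.well-known" : "/" + path.join("/");
  return "https://" + host + location + "/did.json";
}

// Hypothetical helper: host (optionally with port) plus path segments -> DID.
function toDidWeb(host: string, pathSegments: string[] = []): string {
  const parts = [host].concat(pathSegments).map((s) => encodeURIComponent(s));
  return DID_WEB_PREFIX + parts.join(":");
}

console.log(didWebToUrl("did:web:localhost%3A8080"));
// https://localhost:8080/.well-known/did.json
console.log(toDidWeb("example.com", ["path", "some+subpath"]));
// did:web:example.com:path:some%2Bsubpath
```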
@llorllale -1 to base64url-encoding did:web URLs, though. (Since base64url-encoding removes one of the nice properties of did:web DIDs, which is, readability / recognition of the domain name.)
So for example, https://localhost:8080 would base64url-encode as did:web:bG9jYWxob3N0OjgwODA=, which is opaque to human eyes.
@dmitrizagidulin
@llorllale -1 to base64url-encoding did:web URLs, though. (Since base64url-encoding removes one of the nice properties of did:web DIDs, which is, readability / recognition of the domain name.)
So for example, https://localhost:8080 would base64url-encode as did:web:bG9jYWxob3N0OjgwODA=, which is opaque to human eyes.
Fully agree - I mixed them up this morning before coffee somehow.
@llorllale sure. I'll make the PR today.
The proposed solution is - URL-encoding (as in, encodeURIComponent) both the host portion, and each path portion. So, the test vectors would be:
1. `did:web:localhost%3A8080` -> `https://localhost:8080/.well-known/did.json`
2. `https://example.com/path/some+subpath` -> `did:web:example.com:path:some%2Bsubpath`
+1
@dmitrizagidulin @OR13 I just realized url-encoding the path components results in a non-compliant DID as per the syntax: https://www.w3.org/TR/did-core/#did-syntax
Nevermind: https://tools.ietf.org/html/rfc3986#section-2.4
So in summary, if I understand it right:
- parse (split) the did prefix, the method name, and the method-specific ID
- url-decode the method-specific ID*
  - * should implementations barf if a % is still present after decoding?
- method-specific ID => s/:/\//
- HTTP GET https:// + previous result
URI recommendations: https://www.w3.org/Addressing/URL/4_URI_Recommentations.html
The percent sign ("%", ASCII 25 hex) is used as the escape character in the encoding scheme and is never allowed for anything else.
Some test vectors for percent-encoding: https://www.w3.org/2004/04/uri-rel-test.html#reg-percent
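The resolution steps summarized above could be sketched roughly as follows. One deliberate deviation, which is an assumption of this sketch: the method-specific ID is split on `:` before percent-decoding, because decoding first turns `%3A` back into `:` and the port would then be rewritten as a path segment. The function names are hypothetical. Only malformed escapes are rejected here; whether a literal `%` left after decoding should be an error stays an open question.

```typescript
// Hypothetical helper: returns the decoded segments of the method-specific ID.
function parseDidWeb(did: string): string[] {
  // Step 1: split off the "did" prefix and the method name.
  const parts = did.split(":");
  if (parts.length < 3 || parts[0] !== "did" || parts[1] !== "web") {
    throw new Error("not a did:web identifier: " + did);
  }
  // Steps 2 and 3, reordered: the remaining parts are already split on ":",
  // so each one is percent-decoded on its own.
  return parts.slice(2).map((segment) => {
    try {
      return decodeURIComponent(segment);
    } catch (e) {
      // decodeURIComponent throws a URIError on malformed escapes such as "%zz".
      throw new Error("malformed percent-encoding in segment: " + segment);
    }
  });
}

// Step 4: the URL to HTTP GET is "https://" plus the segments joined by "/".
function resolveUrl(did: string): string {
  return "https://" + parseDidWeb(did).join("/");
}

console.log(resolveUrl("did:web:localhost%3A8080"));
// https://localhost:8080

// Following the listed order literally (decode, then s/:/\//) loses the port:
console.log(decodeURIComponent("localhost%3A8080").replace(/:/g, "/"));
// localhost/8080
```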
@llorllale Hi,
I just realized url-encoding the path components results in a non-compliant DID as per the syntax: https://www.w3.org/TR/did-core/#did-syntax
Nevermind: https://tools.ietf.org/html/rfc3986#section-2.4
can you clarify please if we should incorporate encodeUrI into implementation or not?
It sounds like did web does not currently support encodeUrI or ports.... and folks should assume that remains true until this issue is closed after the spec is updated.
yeh, incorporating ngrok to use did:web in development. Seems like the easiest way to be compliant in development
hah, nice, I <3 ngrok.... that's an awesome idea.
I'm not sure if this should be a separate issue or not, but it's related to encoding 😅. The spec does not mention how to deal with non-ASCII domain names.
punycode is an option for that, but it won't cover the port issue and is not as easily available as encodeURIComponent.
However, the did-core spec does not allow the % character in the method-specific-id, which is currently a blocker for encodeURIComponent.
@mirceanis non ascii domain names cannot be DIDs.... if the spec needs to be updated to support them we need URL safe bidirectional transformations.
Update: The DID Core spec is being updated to accept % characters as part of the DID URI ABNF, as of PR https://github.com/w3c/did-core/pull/703.
So, we'll go with percent url encoding, since that is now valid.
Updated the text for this in #38
https://example.com/foo:bar is also an interesting case, since : is a valid path character in URIs. Would that become did:web:example.com:foo%3Abar? What if a URL already has encoded characters, are they decoded first and then re-encoded per path segment? For example, https://example.com/foo%3Abar -> https://example.com/foo:bar -> did:web:example.com:foo%3Abar. In this case, what URL does that DID refer to? https://example.com/foo:bar or https://example.com/foo%3Abar?
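One way to explore that question is a sketch that decodes any existing escapes in each path segment and then re-encodes the segment. This normalization is an assumption, not something the thread has settled, and `urlToDidWeb` is a hypothetical name. Under this assumption both URLs collapse to the same DID, so the DID alone cannot say which of the two was meant.

```typescript
// Hypothetical helper: https URL -> did:web identifier, one segment at a time.
function urlToDidWeb(url: string): string {
  // Capture the authority (host plus optional port) and the path,
  // ignoring any query string or fragment.
  const match = /^https:\/\/([^\/?#]+)((?:\/[^?#]*)?)/.exec(url);
  if (match === null) {
    throw new Error("expected an https URL: " + url);
  }
  const host = match[1];
  const pathSegments = match[2].split("/").filter((s) => s.length > 0);
  // Assumption: decode existing escapes first, then re-encode, so that
  // "foo:bar" and "foo%3Abar" normalize to the same segment.
  const encoded = [host]
    .concat(pathSegments)
    .map((s) => encodeURIComponent(decodeURIComponent(s)));
  return "did:web:" + encoded.join(":");
}

console.log(urlToDidWeb("https://example.com/foo:bar"));
// did:web:example.com:foo%3Abar
console.log(urlToDidWeb("https://example.com/foo%3Abar"));
// did:web:example.com:foo%3Abar
```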
@letmaik - good point, about : characters being allowed in the path segment of URLs. I'll update the proposal with that in mind.
imo, did:web:example.com/foo:bar -> https://example.com/foo:bar/did.json... no problem here.
imo, did:web:example.com/foo:bar -> https://example.com/foo:bar/did.json... no problem here.
That's a DID URL, not a DID. I think this has to be solved for the DID itself, right?
yes, that example was a did url with a path that contained a colon.
today:
The method specific identifier MUST match the common name used in the SSL/TLS certificate, and it MUST NOT include IP addresses or port numbers.
if we wanted to allow the identifier to use ports:
did:web:localhost%3A3000 -> https://localhost:3000/did.json
and for completeness, here is a did url that uses ports and colons in paths:
did:web:example.com%3A1337/foo:bar -> https://example.com:1337/foo:bar/did.json
yes, that example was a did url with a path that contained a colon.
today:
The method specific identifier MUST match the common name used in the SSL/TLS certificate, and it MUST NOT include IP addresses or port numbers.
But what follows after that is important:
Directories and subdirectories MAY optionally be included, delimited by colons rather than slashes.
This applies to DIDs, not just DID URLs. Or are you suggesting to remove that part?