Inconsistency in identifiers for apps: WebIDs and ClientIDs

An app in Solid needs to have an identifier. The spec states that "WebIDs [...] are used as the primary identifier for users and applications." (Solid Protocol, Section 9.1).

However, currently, apps are not only identified by their WebID, but also by their ClientID.

This leads to inconsistencies that hinder implementation. Examples are:

The Interop spec defines an endpoint for Agent Registration Discovery (Solid Interop, Section 7.1.2) that should be able to look up a registration for an agent based on the user's access token. This token contains the ClientID of the agent, whilst the registries of the Interop spec contain the WebID of the application.
There are two documents that describe an app (i.e. the WebID document and the ClientID document) and those documents contain duplicate information that is defined by different terms (e.g. interop:applicationName and oidc:clientName).
...
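
The first mismatch can be illustrated with a minimal Python sketch. The registry layout, the claim names as read here, and every URL are assumptions made for illustration only; none of them are taken from the Interop or Solid-OIDC specs.

```python
# Hypothetical agent registry, keyed by the application's WebID
# (the identifier the Interop registries are said to contain).
AGENT_REGISTRY = {
    "https://app.example/profile#id": "https://alice.example/registries/agents/app1/",
}


def find_registration(token_claims, registry):
    """Look up an agent registration using the client identifier found in a token."""
    return registry.get(token_claims["client_id"])


# Token whose client_id is a separate ClientID document URL: the lookup finds nothing.
token_separate = {
    "webid": "https://alice.example/profile#me",
    "client_id": "https://app.example/clientid.jsonld",
}

# Token whose client_id is the same URI as the app's WebID: the lookup succeeds.
token_same = {
    "webid": "https://alice.example/profile#me",
    "client_id": "https://app.example/profile#id",
}
```

With separate identifiers the server needs some additional mapping from ClientID to WebID before it can answer the discovery request; with a shared identifier no mapping is needed.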

Considering that the Solid spec states that WebIDs "are used as the primary identifier for users and applications", it seems logical to continue with the WebID of an app and drop the ClientID of an app.

One solution would be to merge the information in both documents into one (WebID) document and serialise this information as desired using content negotiation.

However, in this solution, there would be a number of additional problems:

Solid and in particular Solid OIDC and the WebID spec say nothing about Accept headers
This would still not solve the issue of different terms to describe the same information

Given the importance of keeping the specification consistent and the use of identifiers within Solid, I hope we can tackle this issue as fast as possible.

Oct 08 '22 13:10 tomhgmns

There is a related issue in Solid-OIDC https://github.com/solid/solid-oidc/issues/95

My preference would be to just provide oidc vocab and revert the specific JSON-LD requirement for ClientID Documents.

There are two documents that describe an app (i.e. the WebID document and the ClientID document) and those documents contain duplicate information that is defined by different terms (e.g. interop:applicationName and oidc:clientName).

I don't think there was ever an intention to have two different documents. Solid-OIDC started with clients simply using WebIDs. Which later were replaced with ClientID and ClientID Document terminology.

Duplicate predicates in the interop namespace would most likely be removed. And we should pick one consistent set of predicates, be it from the oidc namespace or whatever.

Oct 08 '22 13:10 elf-pavlik

Agree with picking one - no strong opinion on which - more important to just have one that's definitive.

Oct 08 '22 14:10 justinwb

Agreed. The client_id and webid of an App should be the same URI, as was probably meant that way, and is semantically quite obvious. It then dereferences to a single document, which is both the WebID Profile and the Solid-OIDC ClientID document. ~~Together with aligning the vocabulary,~~ [T]his solves a number of problems and is semantically more elegant.

EDIT: I'm not sure picking one vocabulary is a good thing here. Interop vocab should indicate data (e.g. labels) usable by Apps/Agents following that spec; Solid-OIDC vocab is meant to indicate data necessary for (Solid) OIDC flows to work. By having both live together in one document, we provide a stable foundation for Interop operations, while also allowing App developers to switch out (or add) authentication methods.

Oct 08 '22 14:10 woutermont

On the most urgent matter of the WebID/ClientID Document format, we have a few options:

Have the WebID spec change it to JSON-LD by default.
Have the Solid-OIDC spec change theirs to Turtle by default.
Have the WebID spec mandate an Accept: text/turtle header.
Have the Solid-OIDC spec mandate an Accept: application/ld+json header.

Personally, I don't have a strong preference. A consideration in favour of Turtle could be that it seems to be somewhat of a standard in many specs and existing applications. On the other hand, keeping JSON-LD as the default keeps up the hope of becoming compatible with OIDC (e.g. with its Federation's Entity Statements).

Oct 08 '22 15:10 woutermont

Solid and in particular Solid OIDC and the WebID spec say nothing about Accept headers

To keep specs orthogonal, I think this should remain this way. As long as they say that RDF MUST be returned, any standard RDF serialization should do. This will also avoid the problem of every (Solid) spec having to repeat statements about RDF serializations, which may end up conflicting.

See also this long thread on mandatory versus recommended RDF serializations in the WebID CG: https://github.com/w3c/WebID/issues/3

Oct 08 '22 16:10 rubensworks

It is important not to conflate things.

First, WebIDs can be and are used to identify applications. Applications that use this mechanism will also need to be capable of managing private keys and issuing access tokens. In other words, these applications do not use the OAuth2 authorization_code flow. This is the case presently for many bots, scripts and other embedded devices that rely on flows other than authorization_code.

Second, for applications that do use the authorization_code flow, the principal agent is identified with a WebID. This is typically an individual piloting an app. That agent is also delegating access to a particular app and making use of an authorization system that is defined outside of the Solid specification, namely: OAuth2 and OpenID Connect. There is an important distinction between the agent identified with a WebID and the entity defined with a client_id. Calling both of those identifiers a WebID would create a different set of problems.

Either way, in order to work with OAuth2 and OpenID Connect, the Solid-OIDC specification uses a particular form of compact JSON-LD. This allows the Solid-OIDC specification to make use of and align with the OIDC Dynamic Registration specification. The use of a JSON flavor for serializing this document is unlikely to change. The use of JSON-LD was a compromise in order to fit both the Solid and the OpenID models, but leaving JSON altogether would break conformance with OpenID Connect: and adopting the WebID terminology for that does precisely this. There is some discussion of changing the document format of Client Identifiers to align with the draft OpenID Federation specification, but that would be JSON, not even JSON-LD and certainly not Turtle. Expecting the document format of OpenID application registrations to move away from JSON is mistaken, at least if we plan to continue to use OpenID Connect.

Because of the requirements of the WebID draft specification, it is not possible to call these Client Identifier URIs "WebIDs". And requiring these documents to be a non-JSON format conflicts with OpenID.

It is also important to note that there are many types of identifiers. WebIDs are one such type. Solid does, in fact, build significantly on WebIDs, but at no point does the Solid Protocol specification state that WebIDs are the only identifier that can be used for agents and applications. Some people in Solid are already using DIDs.

While I can appreciate that having multiple types of identifiers can introduce complexity, this is a feature not a bug of decentralized ecosystems. It is also a feature of an ecosystem that looks to the future.

Oct 08 '22 20:10 acoburn

FYI in https://github.com/solid/solid-oidc/issues/95#issuecomment-1272326625 I'm bringing up the possibility of using client_id exclusively for OIDC, following the OpenID Federation spec mentioned by Aaron, while having a distinct application WebID for the broader Solid ecosystem.

That agent is also delegating access to a particular app and making use of an authorization system that is defined outside of the Solid specification, namely: OAuth2 and OpenID Connect.

Solid Application Interoperability introduces the basis of an authorization system that allows end-users to authorize applications.

There is an important distinction between the agent identified with a WebID and the entity defined with a client_id. Calling both of those identifiers a WebID would create a different set of problems.

IMO we can denote both the end-user and the application (client) with WebID or DID or something else. We should be more careful not to use the term WebID without a context of what it denotes.

I think we need to organize all the requirements discussed here and look for proposals that take all of them into account. The mentioned separation between client_id and application webid could be one of the directions to explore. It might be useful to have a working document where we work on this problem. Conversations can become hard to work with once they grow longer.

Oct 08 '22 23:10 elf-pavlik

There is an important distinction between the agent identified with a WebID and the entity defined with a client_id. Calling both of those identifiers a WebID would create a different set of problems.

I think I'm missing some background. It would help me if you could be a bit more precise in the statement I've quoted above, so I can get an answer to the following questions:

What is the semantic distinction between the agent identified with a WebID and the entity defined with a client_id in case this agent is an application?
Why is this distinction important?
Could you give a few examples of the real problems that would arise if an application's WebID and client_id would be the same? With real problems, I mean issues that are not fixable or are not about trade-offs (e.g. not supporting traditional OIDC or changing the draft WebID spec)

Oct 10 '22 08:10 tomhgmns

To keep specs orthogonal, I think this should remain this way. As long as they say that RDF MUST be returned, any standard RDF serialization should do. This will also avoid the problem of every (Solid) spec having to repeat statements about RDF serializations, which may end up conflicting.

I agree with this point.

Be that as it may, let me rephrase my concern: Solid and in particular Solid OIDC and the WebID spec say nothing about Accept headers even though Solid-OIDC requires the use of JSON-LD and the WebID spec implies the use of Turtle.

If both specs would allow any RDF serialisation, there wouldn't be a problem ;-)

Oct 10 '22 08:10 tomhgmns

While I can appreciate that having multiple types of identifiers can introduce complexity, this is a feature not a bug of decentralized ecosystems. It is also a feature of an ecosystem that looks to the future.

This seems like a very strong statement without real references or examples. Could you give some examples of decentralised ecosystems in which a plethora of different identifiers have helped their growth?

Awaiting your response, I can give some counterexamples to your statement:

The email system only allows one type of identifier: an email address
Identity systems based on blockchain only allow one type of identifier: a DID
Linked data requires not just normal URIs, but HTTP URIs (Source)
...

Some people in Solid are already using DIDs.

You've provided a link to a draft document about DIDs in Solid while I actually expected a link to a list of people that already used Solid with DIDs... To the best of my knowledge, no one actually uses Solid with DIDs in a production environment, but I hope that you can give me a counter example :-)

(Also, can we please stay on topic? This discussion is about using two identifiers of the same type, i.e. HTTP URIs for a concept that is - on first sight - semantically equivalent. You are talking about using two different types of identifiers - DIDs and HTTP URIs.)

Oct 10 '22 08:10 tomhgmns

Solid and in particular Solid OIDC and the WebID spec say nothing about Accept headers even though Solid-OIDC requires the use of JSON-LD and the WebID spec implies the use of Turtle.

If both specs would allow any RDF serialisation, there wouldn't be a problem ;-)

That's a problem indeed (related to #454).

I just created #465 to tackle this specific problem.

Oct 10 '22 08:10 rubensworks

Anyhow, if we decide against using one identifier for an app, it would help if we could include a reference to the other in each of the documents.

By that I mean that:

The Client ID document contains a reference to the WebID and
The WebID document contains a reference to the Client ID

By doing this, we would know, for example, based on the WebID of an app, which redirect URIs are permitted.
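
A minimal Python sketch of that idea follows. The two dictionaries are simplified stand-ins for the real documents, and the linking keys `client_id_document` and `webid` are invented for this example; no spec defines them.

```python
# Hypothetical, simplified views of the two documents describing one app.
webid_document = {
    "id": "https://app.example/profile#id",
    "client_id_document": "https://app.example/clientid.jsonld",  # invented link
}
client_id_document = {
    "client_id": "https://app.example/clientid.jsonld",
    "webid": "https://app.example/profile#id",  # invented link
    "redirect_uris": ["https://app.example/callback"],
}


def links_are_mutual(webid_doc, client_doc):
    """Return True only if each document points at the other."""
    return (
        webid_doc.get("client_id_document") == client_doc.get("client_id")
        and client_doc.get("webid") == webid_doc.get("id")
    )


def permitted_redirect_uris(webid_doc, client_doc):
    """Redirect URIs for an app found via its WebID, or [] if the link is not mutual."""
    if not links_are_mutual(webid_doc, client_doc):
        return []
    return list(client_doc.get("redirect_uris", []))
```

Requiring the reference in both directions matters: a one-way link would let any document claim to belong to someone else's app.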

Oct 10 '22 09:10 tomhgmns

@acoburn

Thanks for the short recap for people who want to follow, but I don't see where we are conflating things.

WebIDs can and are used to identify applications. [...] [T]hese applications do not use the OAuth2 authorization_code flow.

WebIDs can indeed be used for that purpose; we do so ourselves. However, I see no reason why WebID-identified applications could not use the authorization_code flow. In fact, if applications need to have an ID dereferenceable to a profile document (e.g. like in the Interoperability draft), a WebID seems like a good fit.

There is an important distinction between the [principal] agent identified with a WebID and the entity defined with a client_id. Calling both of those identifiers a WebID would create a different set of problems.

You are right in pointing out the obvious distinction between the (social) agent using the client and the agent that is the client itself. However, I don't see why both could not be identified by (different) WebIDs. Again, the client is an agent that needs to be identified, and about which we want to declare some statements in a profile document.

[Solid-OIDC] uses a particular form of compact JSON-LD [...] [to] align with the OIDC Dynamic Registration specification [and maybe the OpenID Federation specification]. The use of a JSON flavor for serializing this document is unlikely to change. [...] [L]eaving JSON altogether would break conformance with OpenID Connect: and adopting the WebID terminology for that does precisely this. [Because of this] it is not possible to call these Client Identifier URIs "WebIDs".

I'm not sure why you are holding that against us. It is exactly this problem we want to raise once more, as you did yourself a year ago in https://github.com/w3c/WebID/issues/3 (before that issue was sidetracked into an overly general discussion about serialisations).

If we presume that an application needs [A] an identifier that dereferences to a profile document containing Interop data, and [B] an identifier that dereferences to a document containing OIDC data, then why on earth would we have to create two different URIs pointing towards two different resources? (And likewise with any two other ID documents.) This holds even more strongly when we realise that a lot of the information in both needs to be the same. Stubbornly keeping a specification that makes it more complex for developers is a bad practice, everybody knows that.

It is also important to note that there are many types of identifiers [and] at no point does the Solid Protocol specification state that WebIDs are the only identifier that can be used for agents and applications. Some people in Solid are already using DIDs.

Definitely true, and rightly so. Now let's learn a bit about orthogonality from the DID's take on representations, and kick those default obligations out of the WebID and DID:Solid specs!

It isn't even hard; it's not like we're asking anyone to give up their preferred syntaxes. Just don't demand them on default requests:

WebID requires that servers must at least be able to provide Turtle representation of profile documents, when requested with text/turtle as the preferred format in the Accept header.

For clarity we can then possibly add:

WebID requires that servers must at least be able to provide a (i.e. any) W3C-recommended RDF-representation of profile documents, when requested without an Accept header or with an Accept header containing no acceptable formats.

Oct 14 '22 14:10 woutermont

If we presume that an application needs [A] an identifier that dereferences to a profile document containing Interop data, and [B] an identifier that dereferences to a document containing OIDC data, then why on earth would we have to create two different URI's pointing towards two different resources? (And likewise with any two other ID documents.) This holds even stronger when we realize that a lot of the information in both needs to be the same.

In https://github.com/solid/solid-oidc/issues/199 and https://github.com/solid/solid-oidc/issues/95#issuecomment-1250087497 we discuss aligning the OIDC-specific parts with the example Metadata in an RP's Entity Configuration from OpenID Connect Federation 1.0 - draft 22.

If we do that, having two different IRIs to identify the application in two different contexts might actually help us to meet the different requirements of those two contexts:

client_id - specific to OIDC interactions following the OpenID Connect Federation 1.0 requirements
solid.appWebId - used in the broad Solid context.

While we are at it, we could also adjust the claim in the ID Token to have a matching solid.userWebID. As the ecosystem evolves, we could generalize further into solid.appDID and solid.userDID. (@woutermont I think you suggested something along those lines in https://github.com/solid/solid-oidc/issues/26)

Oct 14 '22 18:10 elf-pavlik

Just a kind reminder for the editors (i.e. @timbl @justinwb @dmitrizagidulin @kjetilk @csarven and @RubenVerborgh) that we keep running into this issue.

It would be great if this issue could have some more visibility so that we can come to a decision.

Oct 31 '22 11:10 tomhgmns

Leading up to 0.9, we operated with milestones and project boards and a process to nominate and get issues on the board. Since I left Inrupt, I have not had the capacity to be a driving force for keeping the process going, but I believe we need to have that kind of structure and probably stick more closely to it too. At least, we can tag it.

Oct 31 '22 13:10 kjetilk

@acoburn what do you think about my suggestion above https://github.com/solid/specification/issues/463#issuecomment-1279315991

Oct 31 '22 13:10 elf-pavlik

@elf-pavlik, personally I'm not a big fan of putting claims under a 'solid' object.

Also, if we use the federation entity statements, the clientid can just remain the app's WebID (which is what Tom is advocating for).

I therefore think that this issue would be solved without further adjustments, when we make that step.

Oct 31 '22 14:10 woutermont

Some considerations in light of the discussion during the CG Weekly.

About the issue

There seems to be a huge amount of misunderstanding in this discussion, exemplified by the following statement.

[@acoburn] [WebIDs and ClientIDs] are different semantically. [...] [A] WebID defines an agent (person), and then you go down the chain of delegation. [...] [A] client_id is identifying the client, not the user. If I am a user (person) I’d have a client which would be identified separately.

This is true, but it is beside the question. This issue is NOT about the WebID of the person using an app versus the ClientID of the app itself. It is about the WebID of the app versus the ClientID of the app. Both are identifiers of the very same entity (the app), so the question whether or not these can be the same IRI is completely justified.

The above is a theoretical, semantic question, of course, but @tomhgmns brings up a good practical consequence: if an app's WebID is different from its ClientID, a request containing a token with the WebID of the person using an app and the ClientID of the app itself (e.g. as in Solid-OIDC) cannot be used to retrieve data from the app's WebID Document (e.g. as in SAI).

About solution

We could indeed mandate mutual references in each document, as proposed by @elf-pavlik, but this seems just a hack when we're actually talking about the same entity. In the same line, we could put every piece of data from different specs in different resources, and let them all point to one another.

A more interesting option becomes available in the light of integration with OIDC Federation. @acoburn says the following:

[With OIDC Federation] we can remove the whole client id from solid-oidc and use oidc federation. [...] If that happens that is pure json. So I encourage to keep separate, conflating is trouble in long term.

While I gladly agree with the perspective this brings, I believe the conclusion here (my emphasis) is misguided.

In the context of OIDC Federation Entity Discovery with Automatic Registration, which is what will subsume the current OIDC Dynamic Client Registration flow, the Client ID of an app will be its Entity Identifier (a globally unique URI). Such a URI has no specific restrictions except that the Entity Configuration (a self-issued Entity Statement) must be available at that URI's .well-known/openid-federation endpoint.

We can thus perfectly envision an app with WebID / Entity Identifier / ClientID https://example.app, serving an RDF description on its base URL and a JSON Entity Configuration on https://example.app/.well-known/openid-federation.
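
As a rough Python sketch of how a client or provider might derive that second URL from the shared identifier: the function name is hypothetical, and the sketch assumes the identifier carries no query or fragment. Whether a WebID containing a fragment (such as `#id`) could serve as an Entity Identifier at all is not settled by anything in this thread.

```python
WELL_KNOWN_SUFFIX = "/.well-known/openid-federation"


def entity_configuration_url(entity_identifier: str) -> str:
    """Append the well-known path to an Entity Identifier, dropping any trailing slash first.

    Assumes the identifier has no query string or fragment.
    """
    return entity_identifier.rstrip("/") + WELL_KNOWN_SUFFIX
```

For the example above, `entity_configuration_url("https://example.app")` yields `https://example.app/.well-known/openid-federation`, while the base URL itself remains free to serve the RDF description.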

Nov 02 '22 16:11 woutermont

Also there are semantic differences between client-id and webid of an app.

A WebID is an identifier for an agent, whether an app or a person; that identity is a unique name for the agent in the entire world of discourse, on par with any other agent's name.

A client-id, by contrast, is meaningful only for OIDC clients, and is tied to, and often issued by, a single OIDC provider in its own context. Hence client-ids are not names that agents declare for themselves to represent them on the open web. They are provider-specific identifiers, often issued and controlled by the providers themselves, typically bound to assigned secrets for the sole purpose of identification in their system. Thus the client-id of an app with one OIDC provider will mostly not be relevant to another provider, let alone the open web. It represents a contextual contract between provider and client (possibly with their own signed TOC) that can be unilaterally invalidated by the provider, whereas a WebID is a self-declared, self-controlled, global identifier.

Nov 02 '22 17:11 damooo

Also, a single app named Image transcriber app can be registered with multiple OIDC providers (as can be seen in current practice: sign in with Google/GitHub/Twitter), each system assigning a different client-id/secret pair to the same app, identified by the same literal name and home page. They assign such an id after the app accepts their TOC, and can invalidate it if there is any misbehaviour.

A WebID is just a replacement for the literal name of the app, one that can be proven to be controlled by the claimer.

Thus one app with a single literal-name/uri-name(web-id) can have multiple contracts with multiple providers, and thus multiple contractual identifiers. One-to-Many.
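
The one-to-many relation can be pictured with a small Python sketch. All identifiers and provider URLs below are made up for illustration.

```python
# One self-declared, global name for the app (hypothetical WebID).
APP_WEBID = "https://transcriber.example/#app"

# Each provider issues its own client-id for that same app (made-up values).
CLIENT_IDS_BY_PROVIDER = {
    "https://idp-one.example": "a1b2c3",
    "https://idp-two.example": "client-7781",
}


def client_id_for(provider):
    """Return the client-id this app holds with a given provider, or None if unregistered."""
    return CLIENT_IDS_BY_PROVIDER.get(provider)
```

A provider revoking its entry removes one contract but leaves `APP_WEBID` and the other registrations untouched.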

Nov 02 '22 17:11 damooo

@elf-pavlik I very much agree with https://github.com/solid/specification/issues/463#issuecomment-1279315991

I would also emphasize that, given that we are using the OpenID Connect framework for authentication, there are several distinct conceptual entities:

Relying Party (RP) - OAuth 2.0 Client application requiring End-User Authentication and Claims from an OpenID Provider
End-User - Human participant

In this context, the Relying Party is identified with a client_id and therefore is the Client Identifier. The End User is the entity with the webid. This entity may also be a bot -- i.e. a non-Human participant.

What is being discussed here is merging these two roles into a single role where the End User and the Relying Party are the same logical entity and are identified with a single URI.

It is possible to conflate these two roles into a single URI (and it is entirely possible to do this today in the context of the current Solid-OIDC specification), but this is not something I would recommend. These are distinct roles, and as such, I would highly recommend identifying those roles as independent entities.

Nov 02 '22 18:11 acoburn

What is being discussed here is merging these two roles into a single role where the End User and the Relying Party are the same logical entity and are identified with a single URI.

I think this is part of the misunderstanding. The discussion is only about URI(s) denoting the application/client, this discussion doesn't involve at all separate URI which denotes the End User.

Nov 02 '22 18:11 elf-pavlik

Perhaps a bit of logic can help. A WebID identifies an Agent. It can identify a Person, in which case we expect the triple

<#i> a foaf:Person .

to appear inside the Personal Profile Document. The type foaf:Person can be thought of as setting the identity criteria of the object over time.

A WebID can identify a software instance in which case we should have a URL for the class of Apps, and the following triple inside the App document.

<#myCal> a solid:App

where

solid:App rdfs:subClassOf foaf:Agent

If one were to have an ISBN for an App then it should probably be a subclass of solid:App since there can be many instances of it.

Following that thinking we see that both sides of the argument are right:

App IDs and personal IDs are both WebIDs
They are distinct subsets of the set of WebIDs.

Nov 02 '22 18:11 bblfish

To me, it seems that there are two reasonable solutions here, using the various specifications in their current form:

Use separate identifiers, one to align with WebID and the other to align with Solid-OIDC
Use the same identifier but make sure that the corresponding resource can be content-negotiated to produce both valid Turtle (to align with the draft WebID spec) and JSON-LD (to align with the Solid-OIDC spec)
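
The second option could look roughly like the following Python sketch of server-side content negotiation. The function names and the choice of Turtle as the default are assumptions for illustration; which default (if any) should apply is exactly what is debated in this thread.

```python
SUPPORTED = ("text/turtle", "application/ld+json")


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header, default="text/turtle"):
    """Pick a media type for the app document, or None if nothing acceptable is offered."""
    if not accept_header or not accept_header.strip():
        return default  # no preference stated: fall back to the assumed default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return None
```

A Solid-OIDC provider sending `Accept: application/ld+json` and a WebID client sending `Accept: text/turtle` would then both be served from the same URI; a `None` result would map to a 406 response.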

Nov 02 '22 19:11 acoburn

In today's meeting @woutermont made an interesting point that in the gov case they were working on they found the distinction between the Agent ID and the App ID to be unnecessary.

Would that mean that they want something like this:

<#x> a foaf:Person, solid:App .

I.e. an identifier that refers both to a Person and to a Solid App? That seems a bit weird, mostly because people have different identity criteria from apps. Apps have versions, change over time, are tested, can be shipped over the internet in binary form, ... Perhaps I misunderstood.

Nov 02 '22 19:11 bblfish

@RubenVerborgh, we're only going in circles because people get stuck on the same misinterpretation of this issue as @acoburn and @bblfish, which both I and @elf-pavlik have tried to clear up: we do NOT want to say that a Person is an App. We are talking about assigning an App a WebID (cf. SAI) and a ClientID (cf. Solid-OIDC), and whether these two IDs can be the same URI.

The "issue" to me is semantic in nature, even though it clearly has practical effects: we cannot discover the one IRI via the other. "Tackling it" means: resolving the semantic friction and making that discovery possible.

To answer @damooo, who is the only one who seems to have understood the core of the issue: a WebID and a ClientID (in the Solid-OIDC or OpenID Federation sense) are both globally unique identifiers. A WebID has semantics in the context of the WebID spec, a ClientID has semantics in the context of the (Solid-)OIDC (Federation) spec. Those respective semantics are:

an HTTP URI which refers to an Agent and dereferences to a document describing the Agent;
a URI that is globally unique and that is bound to one Entity.

These semantics are not incompatible; it is perfectly possible to have a 'ClientWebID': "an HTTP URI that is globally unique, that is bound to one Entity, which refers to an Agent and dereferences to a document describing the Agent." An Identifier adhering to that description would have meaning in both specs.

Since in reality there is only a single Entity to which both the WebID and the ClientID correspond, it is also possible to assign this ClientWebID to it, and use it in both contexts. The only thing preventing us from doing that today is a bad interpretation of the WebID spec; but this will no longer be a problem once we incorporate OpenID Federation.

EDIT: Shoutout to https://github.com/solid/solid-oidc/issues/207, which will solve this 👍

Nov 02 '22 19:11 woutermont

These semantics are not incompatible; it is perfectly possible to have a 'ClientWebID': "a HTTP URI that is globally unique, that is bound to one Entity, which refers to an Agent and dereferences to a document describing the Agent." An Identifier adhering to that description would have meaning in both specs.

Ok, I think I understood you @woutermont, I was just trying to make sure above to exclude the more bizarre interpretation.

Perhaps I can put it logically too. Let us say that we have an oidc:AppId and a :ClientWebID. You are claiming these overlap or may even be identical concepts, thought to be distinct only for reasons of syntax of the documents. I.e. that we can have <#app> be a WebID that is both those subtypes simultaneously.

<#app> a :ClientWebID, oidc:AppId .

That actually seems quite reasonable to me, and indeed it was considered by @acoburn quite early on, since he was arguing to allow WebIDs to have turtle representations for just that reason https://github.com/w3c/WebID/issues/3 .

Nov 02 '22 19:11 bblfish

indeed it was considered by @acoburn quite early on, since he was arguing to allow WebIDs to have turtle representations for just that reason https://github.com/w3c/WebID/issues/3 .

Exactly 😄 That's why I find it so bizarre that he keeps involving the User in the discussion.

With regards to the logical notation, I don't think that you can put it that way, at least not in RDF: the identifier (<#app>) stands for the Entity it identifies, not for itself as an Identifier; so <#app> a :ClientWebID . actually claims that the App is a ClientWebID. You would need to be able to "parenthesize" the term <#app> to be able to talk about it on a meta-level: Meta(<#app>) a :ClientWebID .

Nov 02 '22 20:11 woutermont

With regards to the logical notation, I don't think that you can put it that way, at least not in RDF: the identifier (<#app>) stands for the Entity it identifies, not for itself as an Identifier; so <#app> a :ClientWebID . actually claims that the App is a ClientWebID. You would need to be able to "parenthesize" the term <#app> to be able to talk about it on a meta-level: Meta(<#app>) a :ClientWebID .

I agree. I just quickly grasped for those terms. But yes, something like solid:App and oidc:Client would have been better terms to use. Here <#app> is the WebID - in relative form, as we assume that the context in which it is dereferenced will resolve it to a full absolute URL. It is a WebID because it identifies an agent, but we can be more precise about what subtype of agent it identifies.

<#app> a solid:App , oidc:Client .

Do we have official terms for any of those types yet?

Nov 02 '22 20:11 bblfish