vc-data-model
Avoiding confusion by renaming 'credentialSubject'
In section 4.4, the term 'credentialSubject' is defined, suggesting a relation between the credential in which it is included and a 'Subject', i.e. an entity about which claims are made. However, the credentialSubject section contains a list of claims, rather than a (or more) subject(s).
This issue calls for renaming 'credentialSubject', and provides the following suggestions:
- 'credentialSubjectProperties'
- 'claims'
+1 to claims
+1 to claims
+1 to claims
+1 to claims
We have been round this one many times, and the reason it was changed from claims is that the ID is not the ID of the claim, but the ID of the subject. So the object needs to have subject somewhere in its name.
@David-Chadwick, I agree that it makes sense to those of us in the group to call it the credentialSubject for the reasons you mentioned. This doesn't change the fact that it is confusing for others. We say that a verifiable credential contains claims, then show a data model that (for valid, yet pedantic reasons) has a credentialSubject property. We then have to explain that that's where claims should go, and then explain why the property is not just called "claims," since that is what a verifiable credential supposedly contains.
Rather than requiring this educational moment every time someone new looks at the data model, I support the proposal to change credentialSubject back to claims.
I think there was also confusion about the embedded graph container, making the RDF subject of the claims confusing. We have examples in the focal use cases where there are multiple subjects, for example in a birth certificate and marriage certificate.
"CredentialSubject" strongly suggests that there is always a single subject of the credential, but that's demonstrably untrue.
@dlongley could you provide an example of how an issuer would construct a VC with multiple subjects?
I scanned the current spec, but only found examples such as the following:
...
"credentialSubject": {
  "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
  "degree": {
    "type": "BachelorDegree",
    "name": "<span lang='fr-CA'>Baccalauréat en musiques numériques</span>"
  }
}
...
Am I correct that simply replacing that single object with an array works?
"credentialSubject": [{
  "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
  "degree": {
    "type": "BachelorDegree",
    "name": "<span lang='fr-CA'>Baccalauréat en musiques numériques</span>"
  }
}, {
  "id": "did:example:ebfeb1c276e12ec211f712ebc6f",
  "parent": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "type": "Mother"
  }
}]
If that's correct, I believe @David-Chadwick's issue isn't with CredentialSubject but with the id.
The example says, in effect, that the subject of the first claim is the mother of the subject of the second claim.
In this case, I think the best we can do is explain clearly that "id" is the "id" of the subject of just that one claim. That's something we need to clarify, but my understanding is that the way JSON-LD works the "id" field is necessary. Or is it possible to change the "id" field to "claimSubject"? That's the semantic meaning in this situation. Can we make it explicit?
In any case, when you see the multi-subject "CredentialSubject", it doesn't make sense.
The way out of this dilemma is to have a claims object that contains within it one or more credentialSubject objects. Then it is clear that the VC contains claims, and that the ID is that of the subject. E.g.
"claims": [
  {
    "credentialSubject": {
      "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
      "degree": {
        "type": "BachelorDegree",
        "name": "<span lang='fr-CA'>Baccalauréat en musiques numériques</span>"
      }
    }
  },
  {
    "credentialSubject": {
      "id": "did:example:ebfeb1c276e12ec211f712ebc6f",
      "parent": {
        "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
        "type": "Mother"
      }
    }
  }
]
...
That's not bad, except I believe the "credentialSubject" is actually a "claimSubject" in your example.
I like it too. And with that context, wouldn't just "Subject" work as the name?
Maybe. I think folks get wrapped around the singular subject, thinking a credential can only have one subject. ClaimSubject focuses it nicely on this particular claim, gently inviting credentials that have claims about many subjects.
I will note that the conversation in this issue is all over the place and is pretty classic bike shedding during the Candidate Recommendation phase (typically, a terrible, horrible time to bike shed core properties in the specification since implementers are already busy implementing using the properties that are currently being bikeshedded in this issue).
-1 to anything plural (claims, subjects, properties, etc.). schema.org made that mistake years ago and has been busy playing whack-a-mole to remove all the plural properties. The same goes for the Web Payments specification. We have over a decade of experience now naming properties that will go back-and-forth between JSON and RDF and the best practice is to NOT use plural form.
-1 to claim (and I say this as the person that initially put that in the specification). You are expressing "one or more credential subjects". The language in the spec may lead people to a different conclusion, and if it does, we should fix that specification text (not change the property to something that it's not).
I also note that the WG has many more things to worry about at present than bike shedding a name. Can we please just drop this issue and focus on things that are a better use of the WG's time? I'm concerned that this issue is going to consume a lot of time that should be spent doing things like getting the VC extension registry up and running, working on use case finalization, etc.
@RieksJ -- perhaps there is some non-normative text that you would like added to the specification to explain why the "credentialSubject" property is named what it is named?
I appreciate that there is a lot to do. However, I don't consider heavy workloads a valid argument for deciding to skip issues in a standardization process. I would support it as an argument to postpone the transition to a next phase, because standards should be solid. And in particular, discussions about what might seem to be details should be done carefully, because that's where the devils tend to be, and in my experience as an expert in ISO SC27/JTC1, fixing standards is much harder once they're out there.
Note that this is not an argument to discuss every detail - relevance of the discussion must be shown first. The relevance of this issue is to prevent confusion and misinterpretation by readers (standards should be unambiguous and clear). The term 'credentialSubject' suggests that there is a relation between the credential and a single subject (which isn't there). But even if we say that this standard uses singulars, then 'credentialSubject' still has the word 'Subject', and that doesn't cover its payload (which is a set of Claims). It is like having a sign on the door of a restroom that says 'Chair'.
To me, it is relevant that standards do not have these kinds of things in them. But if there is a consensus that preventing such confusion is irrelevant, then I have no problems with closing this issue, because that's how standardization works.
@msporny: I don't think it is up to me to elaborate on why 'credentialSubject' is named as it is, because I did not invent that name (and I wouldn't have named it that in the first place for the same reasons I created this issue).
@RieksJ,
Using a relationship named credentialSubject makes sense in the data model, particularly if you think of the data as a graph, which is what we're modeling.
A credential is a node in this graph. If you want to represent something about this node, you create a link emanating out from the credential node that connects to another node in the graph. For example, if you want to identify the issuer of the credential, you can create a link named issuer and use it to connect the credential node to another node that represents the issuer. If you want to say that a credential is about a subject, you can link the credential node with a link called credentialSubject that connects to the subject node. If you want to say that the credential is about yet another subject, you create yet another link called credentialSubject and connect that to that other subject.
You can see from this why using plural names for links doesn't make much sense. The links identify the relationship between two nodes in a graph; if you want to say one node has the same kind of relationship with two other nodes, you add two links of the same name to the graph, each connecting to one of the other nodes.
Now, if you want to say things about any one of those subjects, you repeat this process -- you link the subject node you want to say something about to another node through another relationship. For example, you could create a link called alumniOf that connects the subject node to a node that represents a literal string value of Example University.
In the JSON and JSON-LD syntax, links are represented as JSON keys, where the JSON key id is a special key that is used as an identifier for a node. Nodes are represented as JSON objects or, if a node represents a literal value, it can be a string, number, boolean, etc. Other syntaxes may do something different while still being compliant with the data model.
At a high level, the entire graph is therefore comprised of sentences that include a "subject" (a node), a "property" (a link), and an "object" (another node/literal value). These sentences are the "claims" -- and in our model they are understood to have been made by the entity identified as the "issuer".
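As a rough illustration of the "sentences" described above, the following Python sketch walks a credential expressed as plain JSON and lists one (subject, property, object) tuple per link. It is only a simplification under stated assumptions: the function name iter_triples is hypothetical, no JSON-LD context processing is done, string values are treated as literals, and the "_:b" placeholder labels for nodes without an id are an illustrative stand-in rather than the output of a real JSON-LD processor.

```python
from itertools import count
from typing import Any, Iterator, Tuple

Triple = Tuple[str, str, Any]


def iter_triples(node: dict, blank_ids=None) -> Iterator[Triple]:
    """Yield (subject, property, object) sentences for a JSON node object.

    Hypothetical helper: it ignores "@context" and treats every other key
    as a link, which a real JSON-LD processor would not do blindly.
    """
    if blank_ids is None:
        blank_ids = count()
    # A node without an "id" gets a placeholder label (illustrative scheme).
    subject = node.get("id") or f"_:b{next(blank_ids)}"
    for key, value in node.items():
        if key in ("id", "@context"):
            continue
        # An array value stands for several links that share the same name.
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # The link points at another node; describe that node too.
                target = item.get("id") or f"_:b{next(blank_ids)}"
                yield (subject, key, target)
                yield from iter_triples({**item, "id": target}, blank_ids)
            else:
                yield (subject, key, item)


credential = {
    "id": "http://example.com/credentials/4643",
    "issuer": "https://example.com/issuers/14",
    "credentialSubject": [
        {"id": "did:example:abcdef1234567", "alumniOf": "Example University"},
        {"id": "did:example:3d5c623bf63156cb1", "name": "John Doe"},
    ],
}

triples = list(iter_triples(credential))
for triple in triples:
    print(triple)
```

In this sketch the credential node ends up with two separate credentialSubject links, one per subject, which is the singular-link reading described above.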
We have a picture of a graph showing how this works in the data model:
Section 4.4 Credential Subject says "This specification defines a credentialSubject property for the expression of claims about one or more subjects". Since JSON-LD does not allow duplicate keys, then what does a credential look like (in JSON-LD) that has claims (or better: JSON-LD graphs) about two different subjects? And might the figure need clarification so that it better shows how multiple claims on different subjects are done?
@RieksJ,
Since JSON-LD does not allow duplicate keys, then what does a credential look like (in JSON-LD) that has claims (or better: JSON-LD graphs) about two different subjects?
Instead of using a single object for the value of credentialSubject, an array of objects is used:
{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.com/credentials/4643",
  "type": ["VerifiableCredential"],
  "issuer": "https://example.com/issuers/14",
  "issuanceDate": "2018-02-24T05:28:04Z",
  "credentialSubject": [{
    "id": "did:example:abcdef1234567",
    "name": "Jane Doe"
  }, {
    "id": "did:example:3d5c623bf63156cb1",
    "name": "John Doe"
  }],
  "proof": { ... }
}
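Because credentialSubject may hold either a single object or an array of objects, code that reads a credential may want to treat both shapes the same way. A minimal Python sketch, assuming the credential has already been parsed into a dict; the helper name credential_subjects is hypothetical and not part of any specification or library:

```python
from typing import Any, Dict, List


def credential_subjects(credential: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the credentialSubject value as a list, whether it was one object or an array."""
    value = credential.get("credentialSubject", [])
    return value if isinstance(value, list) else [value]


# One subject, expressed as a single object.
single = {
    "credentialSubject": {"id": "did:example:abcdef1234567", "name": "Jane Doe"}
}

# Two subjects, expressed as an array of objects.
multiple = {
    "credentialSubject": [
        {"id": "did:example:abcdef1234567", "name": "Jane Doe"},
        {"id": "did:example:3d5c623bf63156cb1", "name": "John Doe"},
    ]
}

for credential in (single, multiple):
    for subject in credential_subjects(credential):
        print(subject["id"], subject["name"])
```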
And might the figure need clarification so that it better shows how multiple claims on different subjects are done?
+1. If you could propose some concrete text that helps clarify things for you we'll be able to pull it in much more quickly. And if you or someone else has the time to add another picture showing multiple credentialSubject links that would also be great.
W.r.t. the figure: I've tried to do some quick stuff with svg, but I'm having trouble getting it done. I propose the following changes (decreasing importance):
- draw a box around the yellow boxes; this box is similar to that of 'credential graph' and 'proof graph' and could be named 'claims'.
- draw this box several times at a slightly different place, thus suggesting that multiple such claims can be referenced
- add some more yellow boxes such that it becomes clear that there is a linked data graph here.
W.r.t. texts: the comments above by various people indicate that the texts describing what a credential, a subject, a claim, etc. is are far from clear, even for people who are consistent contributors to the text. While the issue I raised was just a part of this discussion, it seems to me that a decision is called for to determine whether to revise the text so as to address these issues, or leave it as it is. While it is my preference to do the first, I don't make these decisions.
Decision on VCWG call 7 May 2019: RESOLUTION: The Working Group has discussed issue #480 and is not willing to make a substantive change to the specification that would trigger another Candidate Recommendation phase. The Working Group is interested in exploring non-normative resolutions to the issue. The WG would like to defer the issue so it can be considered when work continues beyond VC 1.0.
We discussed this on the maintenance working group call and believe that this PR can be closed due to the impact on the number of implementations already done today. If the author believes this should still be addressed it can be handled in V2 and they can reopen it.
The issue was discussed in a meeting on 2021-08-11
- no resolutions were taken
4.6. Avoiding confusion by renaming 'credentialSubject' (issue vc-data-model#480)
See github issue #480.
Wayne Chang: this would add a huge breaking change, renaming a major component
Manu Sporny: +1 this would be a huge breaking change.
Dave Longley: +1 to close
Manu Sporny: +1 to closing
Wayne Chang: probably good to close, but we'd need to see broad support for changes like this in the new working group to re-open
I am disappointed by the way this issue has been treated and has come to a close, as it might have been properly dealt with around the time it was raised. Instead, it has been lying around for over two years, thereby implicitly instructing all implementations to follow the contested practice. Allowing this issue to remain unaddressed caused it to become a 'huge breaking change', and continues to do so up to at least v2. It is like instructing people to litter a place and then refuse to participate in cleaning it up, suggesting that you can have another shot at it when you build a new place (v2). I consider this a bad practice for a standardization group.
I am disappointed by the way this issue has been treated and has come to a close, as it might have been properly dealt with around the time it was raised.
@RieksJ, while I can understand your frustration, the reality is that the group did discuss the issue in depth: namely, in issue https://github.com/w3c/vc-data-model/issues/207
Then, when you raised this issue (during the Candidate Recommendation phase), the group did debate it again and came to the conclusion that it did not want to make the change you were suggesting: https://github.com/w3c/vc-data-model/issues/480#issuecomment-490116412
Since then, there has been hardly any discussion on the issue, which signals to the group that it has not been a concern for people building solutions using Verifiable Credentials.
While you are free to be disappointed with the outcome, the concepts that you raised in the issue did get very broad discussion in the group and the group did come to consensus on the path forward.
Fundamentally, the flaw with the argument in this issue is this: "the credentialSubject section contains a list of claims, rather than a (or more) subject(s)."
The credentialSubject section does, in fact, provide an associated list of credential subjects (that's why the name was picked), identified explicitly by credentialSubject.id or implicitly by an auto-generated identifier (blank node identifier). This is why the credentialSubject property was where the group ended up. We then hang claims off of each credential subject identifier. I personally would've preferred something else, but that's neither here nor there... credentialSubject is what achieved group consensus.
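To make the explicit versus implicit identification concrete, here is a small Python sketch that labels each credential subject with its id when present and with a generated placeholder otherwise. It is an illustration only: the function name label_subjects is hypothetical, and the "_:b" plus counter labels merely imitate the style of blank node identifiers; actual JSON-LD processors assign such identifiers as part of their own processing algorithms.

```python
from typing import Any, Dict, List


def label_subjects(subjects: List[Dict[str, Any]]) -> List[str]:
    """Return one identifier per subject: its "id" if given, else a generated label."""
    labels: List[str] = []
    counter = 0
    for subject in subjects:
        if "id" in subject:
            # Explicit identification via credentialSubject.id.
            labels.append(subject["id"])
        else:
            # Implicit identification: a generated, document-local label
            # (illustrative scheme, assumed for this sketch).
            labels.append(f"_:b{counter}")
            counter += 1
    return labels


subjects = [
    {"id": "did:example:abcdef1234567", "name": "Jane Doe"},
    {"name": "John Doe"},  # no id supplied by the issuer
]

labels = label_subjects(subjects)
print(labels)
```

Either way, each subject ends up with something to hang claims off of, even when the issuer supplied no id.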
RE: ...or implicitly by an auto-generated identifier (blank node identifier).
@msporny Where is this implied or, better, stated?
@rieksj @msporny A major part of the root cause of this issue appears earlier in the specification with a series of statements about Claims that are, generally and fundamentally, not true.
See https://github.com/w3c/vc-data-model/issues/790
RE: ...or implicitly by an auto-generated identifier (blank node identifier). @msporny Where is this implied or, better, stated?
https://www.w3.org/TR/json-ld11/#node-identifiers https://www.w3.org/TR/json-ld11/#identifying-blank-nodes https://www.w3.org/TR/json-ld11-api/#node-map-generation https://www.w3.org/TR/json-ld11-api/#generate-blank-node-identifier
A major part of the root cause of this issue appears earlier in the specification with a series of statements about Claims that are, generally and fundamentally, not true.
There is a flaw in that assumption that I've documented here: https://github.com/w3c/vc-data-model/issues/790#issuecomment-897710929
Thank you @msporny for the interesting links about the mechanics of JSON-LD.
Where in the VC Data Model specification is the connection made between credentialSubject id being optional and, say, for example, https://www.w3.org/TR/json-ld11/#identifying-blank-nodes? Where in the VC data model specification does it say how a non-existent element like credentialSubject id is first detected and then subsequently identified as a blank node, etc.?
@msporny the issue wasn't dealt with in #207. Rather, the resolution of #207 caused the issue to be raised, and subsequent comments suggest there is/was support for it to be actually resolved. While the decision to defer the resolution of the issue appears to have been discussed, there is no mention of the motivation, which I can live with but reflects a way of doing things that in my opinion could be improved. The subsequent decision to close the issue is a 'maintenance decision' rather than a reflection on the actual content. While this might be appropriate for a repo of software code, I consider it inappropriate for a repo of standardization issues.
As there is opposition to closing this issue, I will re-open it.
As a resolution to this issue lies beyond the scope of the current working group, I am going to label it defer-v2
@msporny Where in the VC data model specification does it say how a non-existent element like credentialSubject id is first detected and then subsequently identified as a blank node, etc.? I don't believe past issues are an official part of the specification. The specification needs to stand on its own, doesn't it?
Also, where is the VC data model specification tied to JSON-LD, and where/how is an intelligent reader supposed to deduce this? Most people will not even get to section 6: https://www.w3.org/TR/vc-data-model/#json-ld
JSON-LD, if applicable, needs to be introduced at the beginning of the specification and incorporated into the explanations of what a Claim is and what a Credential is because JSON-LD introduces requirements that are unnatural and will be unexpected for most intelligent readers. The linkages need to be made clear.