vc-data-model
Can a credentialSubject be only a string value?
All examples within the VC test suite and in the examples within the specification show the credentialSubject as an object. However there is no normative language which specifies that this property MUST be an object. Is this a case where a normative statement was missed?
The reason for asking is because we've found an edge case where a frame is modifying the credential subject from an object to a string with the value being the value of the @id property and we're uncertain what the most interoperable path would be for this.
Checking with editors here: @msporny @dlongley @brentzundel @burnburn
However there is no normative language which specifies that this property MUST be an object. Is this a case where a normative statement was missed?
It is probably an oversight, yes.
To be clear, having a URL as the value of credentialSubject in JSON-LD is completely legal -- but really weird, probably not having any practical value. It effectively states: There exists a verifiable credential for the credentialSubject identified by this URL. I have no idea what sort of use case you might want to solve with that information. :)
I expect that you're not asking the frame to embed the credential subject, maybe? If you could work up an example in the JSON-LD playground, that might help us debug what's going on: https://json-ld.org/playground/
Figured out the issue was that we were using the @explicit property incorrectly.
When using a frame like this:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://schema.org",
"https://w3c-ccg.github.io/ldp-bbs2020/contexts/v1"
],
"credentialSubject": {
"@explicit": true
},
"type": "VerifiableCredential"
}
we end up producing a credentialSubject value that is the id represented as a string of the credentialSubject object. By changing the explicit property to false when it's the only property it properly reveals as expected (all properties).
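A quick guard for this failure mode could look like the sketch below. It does not call any JSON-LD library; it only compares a credential before and after framing and flags the case described above, where an object subject comes back as its bare id string. The function name `subject_collapsed` and the sample documents are made up for illustration.

```python
def subject_collapsed(original: dict, framed: dict) -> bool:
    """Return True if framing reduced an object subject to its bare id string."""
    before = original.get("credentialSubject")
    after = framed.get("credentialSubject")
    return (
        isinstance(before, dict)
        and isinstance(after, str)
        and before.get("id") == after
    )


# Hypothetical input credential (trimmed to the relevant property).
original = {"credentialSubject": {"id": "did:example:123", "name": "bob"}}

# Shape reported in this thread when "@explicit": true is the only frame property.
framed = {"credentialSubject": "did:example:123"}

print(subject_collapsed(original, framed))    # True
print(subject_collapsed(original, original))  # False
```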
In any case, I'm thinking we may want to add a normative statement that updates this, but not certain of the bureaucratic hurdles we'd encounter if we did so. I'll defer to others in this maintenance WG to help me figure out what we should do about it, and I can make the updates once we agree on a path.
but not certain of the bureaucratic hurdles we'd encounter if we did so
Welp... I found myself in the middle of managing these "bureaucratic hurdles" now 😆
Thinking this should be a substantive change addressed in the V1.1 spec by the WG to align the test suite with the specification. Going to label it as such to triage this work at least.
Reminder, substantive changes that are in scope for the maintenance group should be labeled v1.2, per https://github.com/w3c/vc-data-model#process-overview-for-vc-data-model-pull-requests
@msporny I think I may have come up with a legitimate reason for this structure.
Imagine issuing a KYC Credential like the following:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://www.w3.org/2018/credentials/examples/v1"
],
"id": "http://example.edu/credentials/3732",
"type": ["VerifiableCredential", "KYCCredential"],
"credentialSubject": {
"id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
"firstName": "Joe",
"lastName": "Smith",
"homeBranchAddress": "1 Bank Street"
},
"proof": {...}
}
This feature could be used to selectively disclose proof of a KYC credential without needing to reveal additional claims like so:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://www.w3.org/2018/credentials/examples/v1"
],
"id": "http://example.edu/credentials/3732",
"type": ["VerifiableCredential", "KYCCredential"],
"credentialSubject": "did:example:ebfeb1f712ebc6f1c276e12ec21",
"proof": {...}
}
This would allow for holder/subject identification based on the id property in the original credential without needing to reveal all of the attributes. In the example, that would be like saying "I can prove I possess a valid credential which was issued to me, but you (the verifier) don't get to see any additional details".
In this case, I think we may just need to clarify its purpose and meaning, and update the test cases in the test suite to support it, if we decide this is beneficial to leave as is.
Selective disclosure of proof of a KYC credential, without revealing additional claims, can also be indicated like this:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://www.w3.org/2018/credentials/examples/v1"
],
"id": "http://example.edu/credentials/3732",
"type": ["VerifiableCredential", "KYCCredential"],
"credentialSubject": {
"id": "did:example:ebfeb1f712ebc6f1c276e12ec21"
},
"proof": {...}
}
This is how I would expect it to be displayed, meaning it is still an object
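To make the object form concrete, here is a minimal sketch of reducing a credential to an id-only subject while keeping `credentialSubject` an object. It only reshapes the data; a real selective disclosure flow would also need a signature suite that can derive a matching proof, which is out of scope here. The helper name `disclose_id_only` is hypothetical, and the sample credential omits the proof.

```python
import copy


def disclose_id_only(credential: dict) -> dict:
    """Return a copy whose credentialSubject keeps only its id, still as an object."""
    reduced = copy.deepcopy(credential)
    subject = reduced.get("credentialSubject")
    if isinstance(subject, dict) and "id" in subject:
        reduced["credentialSubject"] = {"id": subject["id"]}
    return reduced


kyc = {
    "id": "http://example.edu/credentials/3732",
    "type": ["VerifiableCredential", "KYCCredential"],
    "credentialSubject": {
        "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
        "firstName": "Joe",
        "lastName": "Smith",
        "homeBranchAddress": "1 Bank Street",
    },
}

disclosed = disclose_id_only(kyc)
print(disclosed["credentialSubject"])
```

The input credential is left untouched, so the full version can still be kept by the holder.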
Note. We have the same issue with the issuer property, which is causing a bit of a hassle in the presentation exchange spec, since two paths are needed to specify who the issuer is, depending upon whether it is encoded as a string or as an object.
On further reading of the Recommendation, I believe there is normative text which states that credentialSubject MUST be an object, namely "The value of the credentialSubject property is defined as a set of objects", copied from section 4.4, which is normative.
Interesting, I hadn't picked up on that language. It's rather ambiguous as to what it's stating at this point. E.g. is a "set of objects" actually:
[
{
"prop": "value"
},
{
"prop": "value"
}
]
I think we'll want to clarify this statement a bit further.
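One reading of "a set of objects" is "either a single object or an array of objects". Under that assumption, a consumer could normalize both forms into a list before processing. This is only a sketch of that reading, not something the spec text spells out, and `subjects_as_list` is an invented name.

```python
def subjects_as_list(value) -> list:
    """Normalize a credentialSubject value to a list of objects.

    Assumes the "object or array of objects" reading; anything else is rejected.
    """
    if isinstance(value, dict):
        return [value]
    if isinstance(value, list) and all(isinstance(item, dict) for item in value):
        return list(value)
    raise TypeError("credentialSubject must be an object or an array of objects")


single = subjects_as_list({"prop": "value"})
several = subjects_as_list([{"prop": "value"}, {"prop": "value"}])
print(len(single), len(several))  # 1 2
```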
Note. We have the same issue with the issuer property, which is causing a bit of a hassle in the presentation exchange spec, since two paths are needed to specify who the issuer is, depending upon whether it is encoded as a string or as an object.
I agree having a single code path here is much simpler. Let me see if I can find a way to get framing to behave properly so that it can handle it. The original concern here was that we were using the jsonld.frame() api and it was outputting this credential which was producing interoperability issues for us. I wasn't sure if it was an issue on our end or the specs end so I wanted to raise the issue.
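Until the shapes are aligned, one way to get back to a single code path is to normalize at the edge. The sketch below reads the identifier from a value that may be either a string or an object with an `id`, which is the issuer situation described above. `node_id` is a hypothetical helper, not part of any library.

```python
def node_id(value):
    """Return the identifier for a string-or-object property value, or None."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


# Both issuer encodings resolve to the same identifier.
print(node_id("did:example:issuer"))
print(node_id({"id": "did:example:issuer", "name": "Example University"}))
```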
The issue was discussed in a meeting on 2021-09-08
- no resolutions were taken
View the transcript
5.4. Can a credentialSubject be only a string value? (issue vc-data-model#762)
See github issue #762.
Brent Zundel: Next one, 762. Can a credentialSubject be only a string.
… Right now, no normative language says it must be an object. Could it just be a string?
Dave Longley: A string in JavaScript is also an object.
Brent Zundel: Yes, so no big deal.
Wayne Chang: I'll comment on the thread and see if he responds
David Chadwick: No comment needed, the value of a credential subject is defined as a set of objects. So I think there is no change needed - certainly not 1.2.
… The spec categorically states the value is defined as a set of objects... that's pretty concrete - in section 4.4
Wayne Chang: I retract that statement. In IETF, they are defined with curly braces, but in JavaScript they are an object
David Chadwick: I think it's clear, the credentialSubject is a set of objects. We have examples, a marriage certificate... has a set of subjects
Brent Zundel: I think it could be... either an object or an array of objects.
… Perhaps a PR here could clarify it in an editorial way.
David Chadwick: I think this is a 1.1.
Dave Longley: +1 to a 1.1 change if it's to say it's an object or a set of objects
Brent Zundel: Alright, I'm going to change the label.
David Chadwick: All we need to do is clarify it can be a string value
Brent Zundel: ... Having that sentence would clarify.
No objections to deferring this to v2 work in the WG when this was discussed. Relabeling it for now
The issue was discussed in a meeting on 2021-10-27
- no resolutions were taken
View the transcript
2.6. Can a credentialSubject be only a string value? (issue vc-data-model#762)
See github issue vc-data-model#762.
Kyle Den Hartog: no PR addressed to this, WG agreed it was editorial, was deprioritized.
… but could be deferred, which would be my leaning here.
Brent Zundel: any objections to labelling v2?
Juan Caballero: (none).
the current spec is ambiguous regarding this, i believe it should be updated to state:
vc.credentialSubject: String | Object (with id)
I believe the current spec also allows for multiple subjects, implying vc.credentialSubject: String | Object (with id) | Array<String | Object (with id)>... this case should be forbidden, unless there is a really good reason for allowing it.
I think it would be slightly preferable to stick with only object, i.e. not allow string. If we allow string, then I believe that string would have to be an IRI (because of the JSON-LD context). And because of that, it might be easier to understand if we have an object with an "id" property, as shown in https://github.com/w3c/vc-data-model/issues/762#issuecomment-906201414.
A credentialSubject is really two parts: an optional id plus one or more claims. It's best not to conflate the two very separate concepts or reduce them to a single value.
A better answer is: the end solution (where/when/how to use a single DID Identifier string as the value of a credentialSubject attribute) needs IMO to be symmetric/compatible with the syntax used to embed an entire child VC as the value (or subproperty) of a credentialSubject. The latter is a requirement for some of my use cases. I don't know what the solution is (yet).
I prefer issuer, subject, holder to all have a consistent shape.
credentialSubject is currently an outlier since it can be an array... but you can't issue from an array or present from an array.
Here are some things I would like to see allowed or disallowed using normative language:
issuer: { id, name}
issuer: { id }
issuer: id
issuer: [ { id, name }, { id, name } ]
issuer: [ { id }, { id } ]
issuer: [ id, id ]
holder: { id, name}
holder: { id }
holder: id
holder: [ { id, name }, { id, name } ]
holder: [ { id }, { id } ]
holder: [ id, id ]
credentialSubject: { id, name}
credentialSubject: { id }
credentialSubject: id
credentialSubject: [ { id, name }, { id, name } ]
credentialSubject: [ { id }, { id } ]
credentialSubject: [ id, id ]
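To test implementations against the shapes listed above, it can help to classify a value first. The sketch below labels a value as a string, an id-only object, an object, or an array of those. The labels and the function name `shape` are made up for illustration, and the sketch takes no position on which shapes should be allowed.

```python
def shape(value) -> str:
    """Label the structural shape of an issuer, holder, or credentialSubject value."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "id-only object" if set(value) == {"id"} else "object"
    if isinstance(value, list):
        kinds = sorted({shape(item) for item in value})
        return "array of " + " | ".join(kinds) if kinds else "empty array"
    return "unsupported"


print(shape("did:example:1"))
print(shape({"id": "did:example:1"}))
print(shape([{"id": "did:example:1", "name": "A"}, {"id": "did:example:2", "name": "B"}]))
```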
"issue / present from multiple" is a real use case... see also https://help.twitter.com/en/using-twitter/cotweets
The issue was discussed in a meeting on 2022-08-17
- no resolutions were taken
View the transcript
2.4. Can a credentialSubject be only a string value? (issue vc-data-model#762)
See github issue vc-data-model#762.
Brent Zundel: credential subject is currently an object or array of objects. Can it be a simple string?
Manu Sporny: what is the use case for having a subject as a URL.
Oliver Terbu: +1 manu.
David Chadwick: At an earlier time, we discussed this being an email address, possibly... verifier could send PIN code, wallet user could return PIN code, that's proof of possession... there were alternatives, telephone number, send secret to phone number.
… It doesn't have to be a DID.
Manu Sporny: we can allow alternative PoP schemes today through the id in the subject object.
… so do not see any value in altering the text today.
Kerrie Lemoie: prefer to keep it as an object.
Orie Steele: credentialSubject and Issuer and Holder should be aligned. Currently they are not.
… subject can be an array but the others cannot be.
Manu Sporny: hrm, disagree, because each field has a slightly different purpose.
David Waite: agree with Orie that we should make these objects more consistent.
Manu Sporny: like, having an issuer that is a DID is totally fine... whereas doing the same for credentialSubject is problematic.
A credential states 3 core things (besides metadata about the credential itself):
- something claimed
- by someone (in the broad sense, not necessarily a person)
- about someone (broad sense too)
So to be structurally sound, there should be 3 attributes at the same level. Something like:
| Core data | Current location | Ideal root-level location |
|---|---|---|
| Something claimed | credentialSubject.* (except .id) | claim |
| by someone | issuer | issuer |
| about someone | credentialSubject.id | credentialSubject |
In other words, credentialSubject is currently giving inconsistent semantics to its id attribute (the actual subject) and to the rest of attributes (the claims). That is the root issue, and any added complexity that doesn't address that will only make things worse.
As for type consistency between issuer and subject, the straight path would be to make both of them a simple identifier, i.e. a URI (because all we need from them is to be identifiable), and move the actual claims to a new claim or equivalent attribute.
Example:
{
"issuer": "somescheme:identifierA",
"credentialSubject": "anyscheme:identifierX",
"claim": {
"favouriteHaircut": "Short",
"birthDate": "2000-01-01"
}
}
If we do that:
- We get consistency.
- The question of this thread becomes equivalent to "Can claim be missing?", for which the answer is simply: of course, if you want to be saying nothing about the subject.
example credential:
{
"@context": [
"https://www.w3.org/2018/credentials/v1",
"https://w3id.org/security/suites/jws-2020/v1",
{
"@vocab": "https://example.com#"
}
],
"id": "urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2",
"type": [
"VerifiableCredential"
],
"issuer": "did:key:zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL",
"issuanceDate": "2010-01-01T19:23:24Z",
"credentialSubject": {
"id": "did:example:123",
"name": "bob",
"favoriteColor": "blue"
},
"proof": {
"type": "JsonWebSignature2020",
"created": "2022-08-18T19:42:10Z",
"verificationMethod": "did:key:zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL#zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL",
"proofPurpose": "assertionMethod",
"jws": "eyJhbGciOiJFUzI1NksiLCJiNjQiOmZhbHNlLCJjcml0IjpbImI2NCJdfQ.._qAMdcCyHz97sWLkJ4tp4bNhVY_Yf-1FrEt9wfUleglQcxK76TR4ZJzsmM3iBk7UATSuVBef_zAFAO2Q6Si7Hw"
}
}
The RDF for it is here
<did:example:123> <https://example.com#favoriteColor> "blue" .
<did:example:123> <https://example.com#name> "bob" .
<urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> .
<urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2> <https://w3id.org/security#proof> _:c14n1 .
<urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2> <https://www.w3.org/2018/credentials#credentialSubject> <did:example:123> .
<urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2> <https://www.w3.org/2018/credentials#issuanceDate> "2010-01-01T19:23:24Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<urn:uuid:3d3d2fad-89ff-4808-b5b7-e1c132c69ad2> <https://www.w3.org/2018/credentials#issuer> <did:key:zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL> .
_:c14n0 <http://purl.org/dc/terms/created> "2022-08-18T19:42:10Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> _:c14n1 .
_:c14n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/security#JsonWebSignature2020> _:c14n1 .
_:c14n0 <https://w3id.org/security#jws> "eyJhbGciOiJFUzI1NksiLCJiNjQiOmZhbHNlLCJjcml0IjpbImI2NCJdfQ.._qAMdcCyHz97sWLkJ4tp4bNhVY_Yf-1FrEt9wfUleglQcxK76TR4ZJzsmM3iBk7UATSuVBef_zAFAO2Q6Si7Hw" _:c14n1 .
_:c14n0 <https://w3id.org/security#proofPurpose> <https://w3id.org/security#assertionMethod> _:c14n1 .
_:c14n0 <https://w3id.org/security#verificationMethod> <did:key:zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL#zQ3shrnCZq3R7vLvDeWQFnxz5HMKqP9JoiMonzYJB4TGYnftL> _:c14n1 .
Notice the subject-predicate-object triples for the claims about the subject:
<did:example:123> <https://example.com#favoriteColor> "blue" .
<did:example:123> <https://example.com#name> "bob" .
@davux,
If we do that:
- We get consistency.
If we do what you're suggesting we actually lose the overall consistency we have across the whole data model. Right now, the data model maps very cleanly to a graph of information. Each JSON object represents a subject (a "node" in the graph), every JSON key in that object represents a property / attribute of the subject (an "edge" in the graph) that points to either a literal value or another subject (or node) in the graph (with id expressing the subject's identifier). This is true across the whole data model and extensions are expected to follow the pattern, allowing the whole model to be traversed and structurally understood, i.e., "claims" are always expressed as subject-property-value relationships (see the claims section of the spec).
This pattern allows VC authors to create deeply nested and fully-expressive sets of claims about a variety of subjects stemming from the root credential subject whilst keeping inline with the core data model, e.g., you can express that person X has a university transcript with a number of completed classes and so on -- or product Y was created by company Z based on materials A, B, and C produced in a supply chain with other linked VCs D, E, and F. This sort of data can even be organized and displayed or combined with other information according to the aforementioned consistent relationship model without knowing every detail of some otherwise idiosyncratic expression of claims.
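The node-and-edge pattern can be illustrated with a short traversal. This is a simplified sketch, not the JSON-LD to RDF algorithm: it ignores `@context`, keeps property names as plain keys, and invents blank-node labels like `_:b0` for objects without an `id`. The `transcript` and `completedClass` properties are hypothetical.

```python
from itertools import count


def node_name(node: dict, counter) -> str:
    """Use the node's id if present, otherwise mint a blank-node style label."""
    return node["id"] if "id" in node else f"_:b{next(counter)}"


def walk(node: dict, counter=None, name=None):
    """Yield (subject, property, value) for a node and every node nested in it."""
    counter = count() if counter is None else counter
    name = node_name(node, counter) if name is None else name
    for key, value in node.items():
        if key == "id":
            continue
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, dict):
                child = node_name(item, counter)
                yield (name, key, child)  # edge to another node
                yield from walk(item, counter, child)
            else:
                yield (name, key, item)  # edge to a literal value


subject = {
    "id": "did:example:123",
    "name": "bob",
    "transcript": {"completedClass": ["Algebra", "Physics"]},
}

for edge in walk(subject):
    print(edge)
```

Because every object follows the same shape, the same loop works at any nesting depth without knowing the vocabulary in advance.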
We should do a better job in the 2.0 work figuring out how to highlight this strength.
Another way of putting this is: If this kind of holistic, consistent, rich expression of data was not seen as necessary, everyone could have just used JWTs with a flat claim model for their use cases like you proposed. Instead, there was a desire for more than that in the core data model to address the variety of use cases that JWTs (alone) haven't been addressing to date. Note: You can wrap the core model in a JWT by putting a credential into a single flat "vc" claim in order to provide integrity (there are pros and cons to this approach vs. using Data Integrity proofs), but you've still got the expressiveness of the consistent VC core data model once you get to the value of that JWT claim.
Thanks @dlongley, I hadn't thought of the nodes as having standalone value and being composable with one another. If we took the id out of the node, that would stop working completely. Thanks for the nice explanation.
PROPOSAL: Define a JSON Schema for credential subject that makes it clear, it MUST be either an Array or Object.
Implement the schema in normative language in the core data model spec.
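As a rough idea of what such a schema might express, the sketch below pairs a hypothetical JSON Schema fragment with a tiny hand-written evaluator that understands only the `oneOf`, `type`, and `items` keywords. The fragment is an assumption for discussion, not the schema from any PR, and a real implementation would use a full JSON Schema validator.

```python
# Hypothetical fragment: credentialSubject is an object or an array of objects.
SUBJECT_SCHEMA = {
    "oneOf": [
        {"type": "object"},
        {"type": "array", "items": {"type": "object"}},
    ]
}

_TYPES = {"object": dict, "array": list, "string": str}


def matches(schema: dict, value) -> bool:
    """Evaluate the small subset of JSON Schema keywords used above."""
    if "oneOf" in schema:
        # Exactly one alternative must match.
        return sum(matches(option, value) for option in schema["oneOf"]) == 1
    expected = schema.get("type")
    if expected is not None and not isinstance(value, _TYPES[expected]):
        return False
    if "items" in schema and isinstance(value, list):
        return all(matches(schema["items"], item) for item in value)
    return True


print(matches(SUBJECT_SCHEMA, {"id": "did:example:123"}))    # True
print(matches(SUBJECT_SCHEMA, [{"id": "did:example:123"}]))  # True
print(matches(SUBJECT_SCHEMA, "did:example:123"))            # False
```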
The issue was discussed in a meeting on 2022-10-19
- no resolutions were taken
View the transcript
3.3. Can a credentialSubject be only a string value? (issue vc-data-model#762)
See github issue vc-data-model#762.
David Chadwick: this issue is whether a subject can only be a string value.
Manu Sporny: No we should not allow subject to be only a string value.
Oliver Terbu: +1 manu.
The specification currently states that credentialSubject is a set of objects. It says nothing about strings, so the spec is fairly clear on this point, but could be more clear.
A PR that provides a JSON Schema for a Verifiable Credential that makes this more clear would be helpful, and that is being handled in #934.
To resolve this issue, we need a PR that states clearly that strings are not allowed.
I believe the test cases for this would need to be updated too. The way this came up was that the BBS+ signature suite was outputting a string when the frame was done improperly, so I wasn't certain whether that should be an error case or not. When I looked into the test cases and other implementations, it wasn't very clear what the acceptable behavior was, hence the question.
@msporny I believe we can close this issue since PR #970 was merged.
Agreed. This issue has been addressed via PR #970 and #976. Closing.
+1 I think this clarifies this and I agree with the outcome here