
`proof` in `@context` and the use of `@container`

OR13 opened this issue 3 years ago • 30 comments

I've been using Neo4j a lot lately.

One of my favorite features is the ability to preview (framed) JSON-LD.

For example:

CALL n10s.rdf.preview.inline(
'
    {
        "@type": "https://schema.org/Organization",
        "https://schema.org/description": "Realigned maximized alliance",
        "https://schema.org/name": "Bartell Inc "
    }
', 'JSON-LD')

For simple cases this works fine... but when I attempt to apply this to spec-compliant verifiable credentials, I get a weird blank node issue with the proof block.

Here is a picture of what I mean:

[Screenshot: Neo4j preview showing two blank nodes separating the credential and proof subgraphs]

Notice the 2 blank nodes that separate these disjoint subgraphs.

I believe this is caused by the way the proof block is defined in the v1 context:

https://github.com/w3c/vc-data-model/blob/v1.1/contexts/credentials/v1#L45

"proof": {"@id": "sec:proof", "@type": "@id", "@container": "@graph"},

This is a lot of complexity... for one of the most important term definitions the standard provides.
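For comparison, a hypothetical definition without the graph container (this is not what the v1 context actually says) would keep the proof triples in the default graph, directly attached to the credential node:

"proof": {"@id": "sec:proof", "@type": "@id"}

With a definition like that, a proof would just be another node in the same graph, at the cost of losing the signature/data separation that later comments explain was intentional.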

I believe this is also the cause of the "double blank node" issue I observed above.

I think what happens is that a first blank node is created for the proof, and since that term is defined with @container: @graph, instead of being able to trace the relationships directly from credential to proof to verification method...

Each proof is being treated as a disjoint subgraph, and the relationship is not being preserved during preview / import...

This is really not ideal, since I am interested in querying changes in these proofs over time for credentials, and that relationship is not being imported.

I suspect this is solvable with a more complicated graph config: https://neo4j.com/labs/neosemantics/4.0/config/

But I wonder if we might correct this behavior in VC Data Model 2.0, such that RDF representations don't have this odd behavior when imported as labeled property graphs.

Anyone know how to solve this?

OR13 avatar Jun 15 '22 19:06 OR13

Relevant sections of the JSON-LD TR:

  • https://www.w3.org/TR/json-ld11/

implicitly named graph: A named graph created from the value of a map entry having an expanded term definition where @container is set to @graph.

https://www.w3.org/TR/json-ld11/#graph-containers

When expanded, these become simple graph objects.

^ pretty sure this is the culprit... it means that if you expand a credential, you lose the relationship between the credential and its proof.
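To illustrate, here is a hand-expanded sketch (not actual tool output) of what the graph container does to a minimal credential: the proof value gets wrapped in a graph object, so its triples land in an implicitly named graph rather than hanging directly off the credential node.

{
  "@id": "urn:uuid:example-credential",
  "https://w3id.org/security#proof": [{
    "@graph": [{
      "@type": ["https://w3id.org/security#Ed25519Signature2018"],
      "https://w3id.org/security#proofPurpose": [{
        "@id": "https://w3id.org/security#assertionMethod"
      }]
    }]
  }]
}

When this is converted to RDF, a fresh blank node is minted to name the proof graph, and sec:proof points at that blank node.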

OR13 avatar Jun 15 '22 19:06 OR13

"proof": {"@id": "sec:proof", "@type": "@id", "@container": "@graph"},

This states that proof will be contained in a separate graph from the default graph. RDF Dataset Canonicalization does this to separate the data you're signing (which is in the default graph) from the proof data (which is in a different graph). Both graphs together constitute an RDF Dataset, and both items are signed over when generating a Data Integrity signature. We did this to ensure that the signature graph didn't pollute the "data being signed" graph.

URGNA2012 (Universal RDF Graph Normalization Algorithm 2012) didn't do this as it only dealt with RDF Graphs, not RDF Datasets, and so we just shoved all the RDF signature data into the default graph (and some people were rightfully upset by that).

When the RDF 1.1 work expanded to include RDF Datasets (part of the driver there was to support concepts that JSON-LD supported but the core RDF data model at the time didn't support), we separated the "data to be signed" from the "signature information" to ensure a cleaner separation between the two types of data. That became the URDNA2015 (Universal RDF Dataset Canonicalization Algorithm 2015).

Hopefully the benefits of this architectural separation between original data and signature data are clear... if they're not, I'm happy to try and elaborate on how jumbling "data to be signed" with "the signature" leads to dirty data over time, especially when you shove it into / take it out of graph databases.
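As a rough TriG sketch of that separation (hand-written, with hypothetical identifiers), the credential sits in the default graph and points at a blank-node-named proof graph; together they constitute an RDF Dataset:

# default graph: the data being signed
<urn:uuid:cred-1> a <https://www.w3.org/2018/credentials#VerifiableCredential> ;
    <https://w3id.org/security#proof> _:proofGraph .

# named graph: the proof data
_:proofGraph {
    [ a <https://w3id.org/security#Ed25519Signature2018> ;
      <https://w3id.org/security#proofPurpose> <https://w3id.org/security#assertionMethod> ] .
}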

As for what neo4j is doing there... you might ask them how they link statements between RDF Graphs in an RDF Dataset... might just be a limitation on their tooling. The JSON-LD Playground doesn't seem to suffer from the same limitation.

msporny avatar Jun 16 '22 00:06 msporny

@OR13 can you give the JSON-LD you use to make that neo4j graph? I'll try it in GraphDB.

VladimirAlexiev avatar Jun 16 '22 11:06 VladimirAlexiev

Here is the example: https://v.jsld.org/2MSdzhMeTHEeoaL1RUjCULnetM3fLnCvckiBMek9DfNYKke6yYGn3ZFL51tmaXqH4VtVkQCqKUy8GCtorQzxd1Y1Na6oLdgR9pKW4oRoMAEj3dXSRn2c8rCkyfPXJxwNfUMfezVvcCAUVB2BGr6GQYouQZH7tYzb8cq9qk2DGQYtDtbyMFBTSHVxCFEzHRhAmtrNzoi9g8mo4vrooak8uCCWkBENYiHzFpLjNdP3rv3m34nd5CGvaMWxKSahsFV4tauACbDEnLXqpAuVJf2ti6U7pxkeYXEQXAGAuhZYBrCoS81FizGFYkYN3sGEfTrQCZFhw1qycxzRDhVdot8L1A1EXA8xiLsaq7CgiWfNrSJbbCjHqA83wGoxi4wFRABsDCxpDTcfaKQHJcedH9vg99VE9V4Jn5v6598U3Lkp9SCuiXwo2sCNqqyuuAcPPDk3sVZD7C62ZgiAHvMHikZDs9EuuxLNzSzTZQV8JD3pkN8Vtz1MEnDK1iQvbT1PoLTRdnrKCUHkMct97gWkGEyiNL5vCZqUTzwkofiiSJkQrMUPoYuotwk6nMK6T67RGAF8qpqFLFoAij8orduXm71xPr9JR92bD1v5YSPjPEW1FwFxpvtfRQ8nxACKKMRc8NANHCwX2ZPULHR4pFN7q8M3ngCw2fpxLJpsk2ZpDJzFAFBpy8PckSG6wZ87QWdW6R7qGn2DuC1o3VmdAKWTG2ERRUo7aYcAvYND2m4qExsCExDbQZACyvqQH9izNeyKkD29srshnacVSPjXd9DQnTCucw2vNZvzTYtoPLEoBa1SzqMbrnQ57XtnzGMMYafY9HyRG98YG4bDSVXDBVckWDcu1gS8gmjsdyTPwWUHGnqSG3yeHwNuu5vJT2MUkyEA1mV8YzpCdD85Fe2PdhqBA

OR13 avatar Jun 16 '22 12:06 OR13

@msporny thanks, I figured that is what was happening.

  • uri: bnode://genid-95f535e97a5e46e58b107eb7ce611f1c-b1
  • uri: bnode://genid-95f535e97a5e46e58b107eb7ce611f1c-b6

^ these are the URIs that neo4j assigns to the blank nodes (based on a default graph config):

CREATE CONSTRAINT n10s_unique_uri ON (r:Resource)
ASSERT r.uri IS UNIQUE
...
CALL n10s.graphconfig.init({
handleVocabUris: 'MAP'
})

... so it is possible to query over the gap between the graphs; you just have to do some string magic.
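One hypothetical shape of that string magic (a sketch only; the Resource label and the split index assume n10s bnode URIs like the ones above):

MATCH (a: Resource), (b: Resource)
WHERE a.uri STARTS WITH 'bnode://genid-'
  AND b.uri STARTS WITH 'bnode://genid-'
  AND a <> b
  AND split(a.uri, '-')[1] = split(b.uri, '-')[1]
RETURN a.uri, b.uri

A cleaner variant of this join appears in the later comments.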

OR13 avatar Jun 16 '22 12:06 OR13

these are the URIs that neo4j assigns to the blank nodes

Hrm, that feels a bit weird. It looks like they're sharing some part of the bnode ID space, but then tacking something on at the end (-b1 and -b6) to give them different IDs. We'd have to talk with their core engineering team to understand why they decided to do it that way vs. just using a universal bnode space for graph names in a dataset.

re: https://v.jsld.org/ -- that's a neat visualization tool :)

Note that the text/x-nquads output shares the same namespace for blank nodes (_:c14n1) and graph names (_:c14n0), so it's possible to do that, neo4j just decided to not do it that way.

msporny avatar Jun 16 '22 13:06 msporny

^ exactly, I suspect that with an updated graph config in neo4j the link would be imported as _:c14n0 -> _:c14n1, but it's not clear what the edge should be... I think most folks would expect that edge to exist when importing a credential.

OR13 avatar Jun 16 '22 15:06 OR13

@msporny

the data you're signing (which is in the default graph)

I fear I've missed something important along the way...

Are you saying that, in RDF Dataset Canonicalization, "the data being signed" is always in the default graph, and not in a named graph?

This is (or will be) problematic for systems (such as Virtuoso) where the default graph is the union of all named graphs (plus, at least in Virtuoso's case, a special not-really-named graph which is populated by inserts that do not specify a target named graph)...

Further, in such systems, this re-blurs the lines between "the data being signed" and "the proof data", as the named graph containing the latter is included in the default graph containing the former -- i.e., the default graph contains both the "data being signed" and "the proof data"...

TallTed avatar Jun 17 '22 15:06 TallTed

@TallTed,

Are you saying that, in RDF Dataset Canonicalization, "the data being signed" is always in the default graph, and not in a named graph?

No, this is unrelated to RDF Dataset Canonicalization.

As for Data Integrity proofs, the above separation of concerns and process may have been better described by just saying that a proof always exists in its own named graph so as to isolate it from other data.

So, whenever you create a proof (when using proof sets as opposed to proof chains), you remove any existing proof named graphs from the default graph, then sign the entire (canonicalized) dataset, then add back the existing proof named graphs and add the new proof named graph that represents the new proof to the default graph.
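A rough pseudocode sketch of that flow (all function names here are hypothetical placeholders, not any real library's API):

# Hypothetical sketch of the proof-set flow described above.
# canonicalize(), hash(), and sign() stand in for the canonicalization
# algorithm, message digest, and signature primitive of a real suite.
def add_proof(dataset, proof_options, private_key):
    # 1. Set aside any existing proof named graphs (proof sets, not chains).
    existing_proofs = remove_proof_graphs(dataset)

    # 2. Sign over the canonicalized remaining dataset and the new proof options.
    signature = sign(private_key,
                     hash(canonicalize(proof_options)) +
                     hash(canonicalize(dataset)))

    # 3. Restore the prior proof graphs and attach the new proof graph.
    restore_proof_graphs(dataset, existing_proofs)
    attach_proof_graph(dataset, proof_options, signature)
    return dataset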

Does this clarify?

dlongley avatar Jun 17 '22 15:06 dlongley

@dlongley --

So, whenever you create a proof (when using proof sets as opposed to proof chains), you remove any existing proof named graphs from the default graph, then sign the entire (canonicalized) dataset, then add back the existing proof named graphs and add the new proof named graph that represents the new proof to the default graph.

"The default graph" seems not to be the correct label for all of the above instances, and even if it were, in Virtuoso (for instance), you cannot "remove any existing proof named graphs from the default graph" unless you are dropping those "existing proof named graphs" from the quad store, because all existing named graphs are part of the default graph (except when specific SPARQL clauses are used to change the definition of the default graph for that query, which does not appear to be part of the process you're describing).

TallTed avatar Jun 17 '22 15:06 TallTed

@dlongley,

Sorry to potentially add to the confusion. I think I follow but want to check (this also feels like we're diverging into a separate topic so I can take this elsewhere if you want):

whenever you create a proof (when using proof sets as opposed to proof chains), you remove any existing proof named graphs from the default graph, then sign the entire (canonicalized) dataset, then add back the existing proof named graphs and add the new proof named graph that represents the new proof to the default graph.

If the proof graph(s) are always decoupled during signing, then the metadata about the signature generation is not part of the signature? So, if I were to somehow gain control over the DID or become a middleman for DID resolution, then I could theoretically introduce an illegitimate signing key and alter or issue VCs for that controller to work with my illegitimate private key? I'm sure I must have that wrong somewhere.

👇🏻 indeed

you cannot "remove any existing proof named graphs from the default graph" unless you are dropping those "existing proof named graphs" from the quad store,

sbutterfield avatar Jun 17 '22 15:06 sbutterfield

@TallTed,

"The default graph" seems not to be the correct label for all of the above instances, and even if it were, in Virtuoso (for instance), you cannot "remove any existing proof named graphs from the default graph" unless you are dropping those "existing proof named graphs" from the quad store, because all existing named graphs are part of the default graph (except when specific SPARQL clauses are used to change the definition of the default graph for that query, which does not appear to be part of the process you're describing).

+1 for finding better terminology to avoid confusion as needed.

EDIT: I presume you could implement the above using a specific SPARQL query as you mentioned (to "change the definition of the default graph") if you need to interact with the data that way via a quad store (as opposed to in memory).
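For example, one hypothetical way to approximate "the dataset minus the proof graphs" in SPARQL (assuming proof graphs are always the objects of sec:proof statements):

SELECT ?s ?p ?o
WHERE {
  GRAPH ?g { ?s ?p ?o }
  FILTER NOT EXISTS { ?cred <https://w3id.org/security#proof> ?g }
}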

dlongley avatar Jun 17 '22 15:06 dlongley

@sbutterfield,

If the proof graph(s) are always decoupled during signing, then the metadata about the signature generation is not part of the signature?

I think responding to individual concerns without a comprehensive response (i.e., what the spec says or should say) on the entire process is leading to more confusion here. But at risk of introducing more confusion in just responding to your particular query, a Data Integrity proof involves signing over a hash of both the canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") and over a hash of the canonicalized metadata for the new proof. In other words, all data is signed except for the signature itself (which is not logically possible to sign over since it is an output of the process).

So, if I were to somehow gain control over the DID or become a middleman for DID resolution, then I could theoretically introduce an illegitimate signing key and alter or issue VCs for that controller to work with my illegitimate private key?

The above should clarify that the answer to this is: "No".

dlongley avatar Jun 17 '22 16:06 dlongley

@dlongley, thank you. That's how I originally had thought about it. Crystal clear now.

sbutterfield avatar Jun 17 '22 16:06 sbutterfield

@dlongley --

a Data Integrity proof involves signing over a hash of both the canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") and over a hash of the canonicalized metadata for the new proof.

Still trying to parse this... It appears that the "both" is misplaced in the sentence and/or the "over a hash of both" is missing one of the things being hashed. Maybe --

a Data Integrity proof involves signing both over a hash of the canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") and over a hash of the canonicalized metadata for the new proof.

-- or --

a Data Integrity proof involves signing over both a hash of the canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") and a hash of the canonicalized metadata for the new proof.

-- or --

a Data Integrity proof involves signing over a hash of both the canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") and the canonicalized metadata for the new proof.

-- or something I'm not seeing yet...

TallTed avatar Jun 17 '22 16:06 TallTed

@TallTed,

The canonicalized metadata is hashed, producing hash1. The canonicalized dataset (with any existing proofs in the default graph removed when using "proof sets") is hashed, producing hash2. The signature is over the concatenation hash1 + hash2.
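In sketch form (placeholder names only, mirroring the prose above):

hash1 = hash(canonicalize(proof_metadata))           # the proof's own metadata
hash2 = hash(canonicalize(dataset_without_proofs))   # the data being signed
signature = sign(private_key, hash1 + hash2)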

dlongley avatar Jun 17 '22 17:06 dlongley

AFAIK, the "Data Integrity Proofs" or what used to be called "Linked Data Proofs" have not changed in this regard since 2017...

Here is an example where I tested them against Mastodon:

(Mastodon is the original web5, get on my level haters).

OR13 avatar Jun 17 '22 21:06 OR13

URGNA2012 (Universal RDF Graph Normalization Algorithm 2012) didn't do this as it only dealt with RDF Graphs, not RDF Datasets, and so we just shoved all the RDF signature data into the default graph (and some people were rightfully upset by that).

I was also working on LD signatures back then when the signatures/proofs still used to be in the same graph as the data, and I remember it felt like the right decision to move the signatures/proofs into their own named graphs as it is now.

peacekeeper avatar Jun 18 '22 10:06 peacekeeper

@OR13 The example doesn't parse in rdf4j, probably because it doesn't yet support JSON-LD 1.1: https://github.com/eclipse/rdf4j/issues/3654

Jena 4.4.0 (2022-01-30) also gave an error:

$ riot --validate test.jsonld
ERROR riot            :: invalid term definition: 1.1
$ riot --version
Jena:       VERSION: 4.4.0
Jena:       BUILD_DATE: T15:09:41Z
  • but Jena 4.5.0 (2022-05-01) works ok:
$ riot --formatted trig test.jsonld
@prefix : <https://ontology.example/vocab/#> .

<urn:uuid:4x7fzuuv>  a  :CertifiedDevice , <https://www.w3.org/2018/credentials#VerifiableCredential> ;
        <https://w3id.org/security#proof>
                _:b0 ;
        <https://www.w3.org/2018/credentials#credentialSubject>
                <did:key:z6MkoZrhfUbGsBFVqawVgyauvoTA8bsNJWyaAQeVkJYdpvXK> ;
        <https://www.w3.org/2018/credentials#issuanceDate>
                "2022-01-15T19:25:55.574Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
        <https://www.w3.org/2018/credentials#issuer>
                <did:key:z6MkoZrhfUbGsBFVqawVgyauvoTA8bsNJWyaAQeVkJYdpvXK> .

<did:key:z6MkoZrhfUbGsBFVqawVgyauvoTA8bsNJWyaAQeVkJYdpvXK>
        a             :Device ;
        :description  "Try to quantify the SAS alarm, maybe it will copy the virtual panel!" ;
        :ip           "21a0:7698:a2bd:ae26:1331:085a:238a:d13d" ;
        :latitude     "69.4264" ;
        :longitude    "-136.3105" ;
        :mac          "cd:22:26:65:6a:9b" ;
        :name         "55S Mobile Program" .

_:b0 {
    [ a       <https://w3id.org/security#JsonWebSignature2020> ;
      <http://purl.org/dc/terms/created>
              "2022-01-15T19:25:55Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> ;
      <https://w3id.org/security#jws>
              "eyJhbGciOiJFZERTQSIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..IkEae_ErY_g3-g43665vgn0KkI_A4Ww_hlDWlL0MlWVy9cddewQFT_TGFeFsqtJREf_OiNyI4ALf5oom1aPcDg" ;
      <https://w3id.org/security#proofPurpose>
              <https://w3id.org/security#assertionMethod> ;
      <https://w3id.org/security#verificationMethod>
              <did:key:z6MkoZrhfUbGsBFVqawVgyauvoTA8bsNJWyaAQeVkJYdpvXK#z6MkoZrhfUbGsBFVqawVgyauvoTA8bsNJWyaAQeVkJYdpvXK>
    ] .
}

@TallTed should we post an issue to SPARQL 1.2 "FROM should allow the exclusion of graphs"?

Maybe not, because to fulfill the goal "separate the data you're signing", a repository would store each VC in its own named graph: storing hundreds or millions of VCs in the default graph would not allow you to separate them.

VladimirAlexiev avatar Jun 20 '22 06:06 VladimirAlexiev

@VladimirAlexiev -- I think there are some scenarios where a NOT FROM could be useful, but I don't think signing scenarios are among them. I don't think I have a strong enough handle on an example scenario of this sort to make the case for NOT FROM in the SPARQL 1.2 wishlist, but if you do, I encourage you to add it soon, as action on items in that wishlist may be taken at any time.

TallTed avatar Jun 20 '22 15:06 TallTed

Seems related: https://github.com/search?q=org%3Aneo4j-labs+bnode%3A%2F%2F&type=code

OR13 avatar Jun 20 '22 17:06 OR13

A simpler one-liner to reproduce the issue (beware: it deletes everything, so don't run this outside of a new database):

MATCH (n)
DETACH DELETE n;

DROP CONSTRAINT ON (r:Resource)
ASSERT r.uri IS UNIQUE;

CALL n10s.graphconfig.init( { handleVocabUris: 'MAP', handleRDFTypes: 'NODES' });

CREATE CONSTRAINT n10s_unique_uri ON (r:Resource)
ASSERT r.uri IS UNIQUE;

CALL n10s.rdf.import.inline(
'
<https://api.did.actor/revocation-lists/1.json#0> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/vc-revocation-list-2020#RevocationList2020Status> .
<https://api.did.actor/revocation-lists/1.json#0> <https://w3id.org/vc-revocation-list-2020#revocationListCredential> <https://api.did.actor/revocation-lists/1.json> .
<https://api.did.actor/revocation-lists/1.json#0> <https://w3id.org/vc-revocation-list-2020#revocationListIndex> "0"^^<http://www.w3.org/2001/XMLSchema#integer> .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://www.w3.org/2018/credentials#VerifiableCredential> .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <https://w3id.org/security#proof> _:c14n1 .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <https://www.w3.org/2018/credentials#credentialStatus> <https://api.did.actor/revocation-lists/1.json#0> .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <https://www.w3.org/2018/credentials#credentialSubject> <did:example:123> .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <https://www.w3.org/2018/credentials#issuanceDate> "2010-01-01T19:23:24Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835> <https://www.w3.org/2018/credentials#issuer> <did:key:z6MktiSzqF9kqwdU8VkdBKx56EYzXfpgnNPUAGznpicNiWfn> .
_:c14n0 <http://purl.org/dc/terms/created> "2022-06-20T16:52:58Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> _:c14n1 .
_:c14n0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/security#Ed25519Signature2018> _:c14n1 .
_:c14n0 <https://w3id.org/security#jws> "eyJhbGciOiJFZERTQSIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..jqpGjbIt1Hr9M5kZNzyPiTGxwm_tf2VqZiFvxIEgW31ryFyhOb_7muNwXEAzBmtL68UUQcB_dGUVfY9z978nAw" _:c14n1 .
_:c14n0 <https://w3id.org/security#proofPurpose> <https://w3id.org/security#assertionMethod> _:c14n1 .
_:c14n0 <https://w3id.org/security#verificationMethod> <did:key:z6MktiSzqF9kqwdU8VkdBKx56EYzXfpgnNPUAGznpicNiWfn#z6MktiSzqF9kqwdU8VkdBKx56EYzXfpgnNPUAGznpicNiWfn> _:c14n1 .

', 'N-Quads')

Then view the data with:

MATCH (n) RETURN n LIMIT 25
[Screenshot: the imported credential with its proof as a disconnected subgraph]

OR13 avatar Jun 20 '22 17:06 OR13

Here is a snippet of CQL that adds a link relationship between the proof node and "similar blank nodes"...

This is an incredibly expensive, hacky workaround:

MATCH
    (n0: Resource),
    (n1: Resource),
    (n2: Resource)
WHERE
    (n0)-[:proof]->(n1) AND
    apoc.text.levenshteinSimilarity(n1.uri, n2.uri) > .8 AND
    apoc.text.levenshteinSimilarity(n1.uri, n2.uri) < 1
MERGE (n1)-[link: DATA_INTEGRITY_PROOF]->(n2) 
RETURN n0, n1, n2
[Screenshot: result of the similarity query]

After this link has been added the graphs are connected.

[Screenshot: the connected graph after adding the DATA_INTEGRITY_PROOF link]

OR13 avatar Jun 20 '22 18:06 OR13

@VladimirAlexiev I had the same issue with JSON-LD v1.1 before... It's a major reason to convert from the standard JSON representation of a credential to the n-quad or framed versions, which seem to be better supported by graph databases.

I suppose the next step should be to create 3 or 4 VCs and import them all, and then look at the graph again.

I would expect to be able to see that they are "proofs for the same information", but from different actors, over time.

OR13 avatar Jun 20 '22 18:06 OR13

A much smarter way to join the graphs after import:

MATCH
    (n1: Resource),
    (n2: Resource)
WHERE
    split(n1.uri, '-')[1] = split(n2.uri, '-')[1] AND
    NOT EXISTS(n1.jws) AND
    EXISTS(n2.jws)
MERGE (n1)-[link: DATA_INTEGRITY_PROOF]->(n2) 
RETURN n1, n2

^ this doesn't work though because of the way the blank node identifiers are assigned during a bulk import...

[Screenshot: three bulk-imported credentials with indistinguishable proof blank nodes]

In this case, 3 credentials are imported, but each has a proof with a blank node id that looks like:

uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b10
uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b11
uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b9

... because they were imported at the same time... even though the credentials were issued at different times.

On the other side of the gap, we have:

uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b5
uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b8
uri: bnode://genid-16ff0ebe17c448c0b1db6d23018428c4-b0

After import, we can tell they are all related by looking at 16ff0ebe17c448c0b1db6d23018428c4... but we can't tell which ones are related to each other, because of the way the blank node identifiers are assigned when multiple credentials (each with a proof that is a @container) are imported at once.

A few thoughts:

  1. stop trying to import RDF directly, and instead transform it before importing.
  2. import RDF, but only 1 credential / object at a time... so that any blank nodes get a useful unique id.

My goal:

  1. minimize any data transformations between RDF and LPGs
  2. import VC / VP over time
  3. Import as much data as fast as possible

It seems the naive solutions to this problem are causing me to trade one goal for another.

OR13 avatar Jun 20 '22 19:06 OR13

Importing objects that might contain blank nodes one at a time seems to work:

[Screenshot: three credentials imported one at a time, each with its own genid]

Left hand side:

uri: bnode://genid-d10239de14ab4697baa44fdef3190c14-b3
uri: bnode://genid-4eb97b93909d41a19febb7483c8e49eb-b3
uri: bnode://genid-a5218ac4e96f433c8d31bb6a1115c49a-b3

Right hand side:

uri: bnode://genid-d10239de14ab4697baa44fdef3190c14-b0
uri: bnode://genid-4eb97b93909d41a19febb7483c8e49eb-b0
uri: bnode://genid-a5218ac4e96f433c8d31bb6a1115c49a-b0

It's now possible to join by looking at the middle component of the uri.

MATCH
    (credential: Resource),
    (signature: Resource)
WHERE 
    ()-[:proof]->(credential) AND
    EXISTS(signature.jws) AND
    split(credential.uri, '-')[1] = split(signature.uri, '-')[1]
MERGE (credential)-[link: DATA_INTEGRITY_PROOF]->(signature) 
RETURN credential, signature, link
[Screenshot: result of the join query]

After this relationship is added:

[Screenshot: credentials connected to their proofs]

OR13 avatar Jun 20 '22 20:06 OR13

Unfortunately, this won't help you with Verifiable Presentations...

Because the proofs on the credentials will have blank node identifiers similar to the proof on the presentation:

[Screenshot: a Verifiable Presentation whose credential and presentation proofs share a genid]

Left:

uri: bnode://genid-83dec2dceeea4792a549afec00991790-b10
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b11
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b12
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b14
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b13

Right:

uri: bnode://genid-83dec2dceeea4792a549afec00991790-b1
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b4
uri: bnode://genid-83dec2dceeea4792a549afec00991790-b7

Same problem as before.

The problem here is worse, though... since we also have the dangling @container from the verifiableCredential relationship:

"holder": {"@id": "cred:holder", "@type": "@id"},
"proof": {"@id": "sec:proof", "@type": "@id", "@container": "@graph"},
"verifiableCredential": {"@id": "cred:verifiableCredential", "@type": "@id", "@container": "@graph"}
  • https://github.com/w3c/vc-data-model/blob/v1.1/contexts/credentials/v1#L81

I'm less sure how to fix this since:

  1. id is not required on VCs or VPs.
  2. @container is on the VC.proof and the VP.proof AND the VP.verifiableCredential relationships.

It should be possible to import the credentials individually, then the presentation, and then define relationships between them... but having to do that for every VP is going to add a LOT of overhead.

... it does work...

[Screenshot: graphs after importing each item one at a time]

After importing each item one at a time... the graphs for a VP can be joined:

[Screenshot: the joined graphs for the Verifiable Presentation]

But I lost the vp.verifiableCredential container along the way... assuming you are lucky enough to always have an id for both VC and VP, this can be fixed at the end with:

MATCH 
    (vp { uri: 'urn:uuid:7ea1be55-fe46-443e-a0ce-eb5e40f47aaa' }),
    (vc { uri: 'urn:uuid:a96c9e16-adc3-48c7-8746-0e1b8c3535ba' })
MERGE 
    (vp)-[link: PRESENTED]->(vc) 
RETURN vc, vp, link
[Screenshot: the PRESENTED link between the VP and VC]

OR13 avatar Jun 20 '22 20:06 OR13

Blank nodes are extremely useful, just like other forms of pronoun. However, they are not appropriate for use in all cases; sometimes, a proper noun (a/k/a a URI, URN, IRI, such as a DID) is more appropriate. I submit that these are such cases.
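For instance (a hypothetical sketch; the proof id below is invented, and the data model does not require one), giving the proof an explicit id would at least give the proof node itself a stable IRI, even though, with the v1 context's @container: @graph, the implicitly named graph wrapping it would still get a blank node name:

{
  "id": "urn:uuid:37a64932-49cf-4afd-8c5e-ced22f87d835",
  "type": ["VerifiableCredential"],
  "proof": {
    "id": "urn:uuid:0b2ad335-4e4d-4b56-9e91-4f4b3f5d0e2a",
    "type": "Ed25519Signature2018",
    "proofPurpose": "assertionMethod"
  }
}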

TallTed avatar Jun 20 '22 21:06 TallTed

I added a similar import for VC-JWTs here https://github.com/transmute-industries/verifiable-data/pull/198

This raises interesting questions, since VC-JWT has an external proof... there is nothing to import regarding the proof semantics (without me making some custom mapping to import properties from the JWT header).

I can see benefits to both approaches... but it's interesting to note that, by default, neither LD Proofs nor VC-JWT imports the proof as connected to the credential.

OR13 avatar Jun 27 '22 13:06 OR13

The issue was discussed in a meeting on 2022-08-03

  • no resolutions were taken

6.7. proof in @context and the use of @container (issue vc-data-model#881)

See github issue vc-data-model#881.

Manu Sporny: I think this is in the core data model, or at least in the core context.
… We could move it out in the future, but for now it should stay.

Brent Zundel: Anyone opposed to that...?
… Taking the label off.

iherman avatar Aug 04 '22 04:08 iherman