ids-specification
RFC: Introduce a 'Policy Repository'
I want to kindly ask to give feedback regarding the following proposal:
Example snippet from the catalog example:
"odrl:hasPolicy": [
{
"@context": {
"@vocab": "https://www.w3.org/TR/odrl-model/"
},
"@id": "urn:uuid:2828282:3dd1add8-4d2d-569e-d634-8394a8836a88",
"permission": [
{
"action": "use",
"constraint": [
{
"leftOperand": {
"@value": "spatial"
},
"rightOperand": {
"@value": "EU"
},
"operator": "EQ"
}
],
"duty": []
}
],
"prohibition": [],
"obligation": []
}
],
https://raw.githubusercontent.com/International-Data-Spaces-Association/ids-specification/71f06f718147f12a4d333e9a9e604d13944882b1/catalog/message/catalog.json
Because many of these policies / offers are very similar, this is highly redundant. I think it would also be allowed to use an ID as a reference instead of the entire object node, meaning:
"odrl:hasPolicy": [
{
"@type": "odrl:Offer",
"@id": "https://provider.com/edc/offer/1"
},
{// next possible policy }
],
At least that is my - still limited - understanding of: https://www.w3.org/TR/vocab-dcat-3/#conformance
Additional constraints in a profile MAY include: Controlled vocabularies or IRI sets as acceptable values for properties
If this is the case, I would even slightly change the id of the policy and turn it into "content-addressable storage" by using the hash of the policy itself (as a URL) to reference it. It would then look like:
"odrl:hasPolicy": [
{
"@type": "odrl:Offer",
"@id": "https://provider.com/policies/cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed"
},
{// next possible policy }
],
where cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed is a sha256 hash of the policy itself. Serialization / canonicalization details need to be defined, of course.
That means there would be a new endpoint containing all possible policies: a policy repository under /policies/{hash}
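A minimal sketch of how such a hash-based identifier could be derived. The canonicalization used here (sorted keys, compact separators) is only an assumed stand-in for whatever serialization / canonicalization scheme would actually be agreed on; `policy_hash` and `policy_url` are hypothetical helper names, and the policy deliberately carries no "@id" and no target.

```python
import hashlib
import json


def policy_hash(policy: dict) -> str:
    """Return the SHA-256 hex digest of a policy in a deterministic JSON form."""
    # Assumption: sorted keys and compact separators stand in for a real
    # canonicalization scheme, which the proposal leaves open.
    canonical = json.dumps(
        policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_url(policy: dict, base: str = "https://provider.com/policies/") -> str:
    """Build the content-addressed reference /policies/{hash} for a policy."""
    return base + policy_hash(policy)


policy = {
    "@context": {"@vocab": "https://www.w3.org/TR/odrl-model/"},
    "permission": [
        {
            "action": "use",
            "constraint": [
                {
                    "leftOperand": {"@value": "spatial"},
                    "rightOperand": {"@value": "EU"},
                    "operator": "EQ",
                }
            ],
            "duty": [],
        }
    ],
    "prohibition": [],
    "obligation": [],
}

# The same policy with a different key order must map to the same identifier.
reordered = dict(reversed(list(policy.items())))
print(policy_url(policy))
print(policy_hash(policy) == policy_hash(reordered))
```

Two datasets that carry the same policy would then point to the same URL, regardless of how each side happened to order the JSON keys.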
This is only possible at all because of the separate profile, which says: https://github.com/International-Data-Spaces-Association/ids-specification/blob/main/catalog/catalog.protocol.md#5-dcat-and-odrl-profiles
Each ODRL Offer must NOT include an explicit target attribute.
because hashing the policy WITH a target wouldn't work :-)
The advantages would be:
- Reduce data transfer, assuming many datasets use the same policy
- Reduce processing overhead (on both sides, but mainly) on the consumer side, because it's immediately clear whether the policy is already known and can be accepted (allow-listed policies...)
- uniqueness of policies may allow some further optimizations during the flow
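The consumer-side shortcut mentioned above could look roughly like this. It is only a sketch: `ALLOWED_POLICY_HASHES`, `hash_from_reference` and `is_preapproved` are hypothetical names, and it assumes the reference follows the /policies/{hash} layout from the proposal.

```python
from urllib.parse import urlparse

# Hypothetical allow-list of policy hashes the consumer has already reviewed.
ALLOWED_POLICY_HASHES = {
    "cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed",
}


def hash_from_reference(ref_id: str) -> str:
    """Take the last path segment of a /policies/{hash} style reference."""
    return urlparse(ref_id).path.rsplit("/", 1)[-1]


def is_preapproved(entry: dict) -> bool:
    """True if the referenced policy is already known and allow-listed."""
    return hash_from_reference(entry.get("@id", "")) in ALLOWED_POLICY_HASHES


offer = {
    "@type": "odrl:Offer",
    "@id": "https://provider.com/policies/"
    "cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed",
}
print(is_preapproved(offer))
```

For references that are not on the list, the consumer would still have to fetch and evaluate the policy as usual.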
Possible DISadvantages:
- Are policies protected content already? And is the hash 'unique' enough to 'protect' its content? Further access control could be applied, too, but probably is not worth it. I guess it's better to add the full policy to the dataset instead of a reference if there are concerns. Can be decided per dataset of course.
Any thoughts on this?
https://www.w3.org/TR/odrl-model/#policy-has
Summary of the discussion in the weekly meeting:
- JSON-LD allows this
- dataspace should define whether such document resolutions must be allowed. This could allow relative URLs and potentially allow-listed dataspace specific repositories. Resolving any 3rd party reference is considered a security issue and should not be done.
- the hash identifier was considered a good idea to uniquely identify a policy
- the transfer size is not considered an issue, because the response could be gzipped content
- we'll add a 'Note' box to the spec to make it clear to the reader that this is a dataspace decision.
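As an illustration of the resolution rule from the summary, a consumer could gate dereferencing along these lines. This is a sketch under assumptions: `may_resolve` is a hypothetical helper, "relative URL" is interpreted as "resolves to the same host as the catalog", and the allow-listed repository host is made up.

```python
from urllib.parse import urljoin, urlparse


def may_resolve(ref: str, catalog_url: str, allowed_hosts: frozenset) -> bool:
    """Decide whether a policy reference may be dereferenced.

    Allowed: references on the catalog's own host (including relative ones)
    and references to allow-listed, dataspace-specific repositories.
    Everything else is treated as a 3rd party reference and refused.
    """
    target = urlparse(urljoin(catalog_url, ref))
    if target.scheme not in ("http", "https"):
        return False  # e.g. urn:uuid:... identifiers are not resolvable
    same_host = target.netloc == urlparse(catalog_url).netloc
    return same_host or target.hostname in allowed_hosts


catalog = "https://provider.com/catalog"
# Hypothetical dataspace-wide policy repository.
trusted = frozenset({"policies.dataspace.example"})

print(may_resolve("/policies/abc", catalog, trusted))
print(may_resolve("https://policies.dataspace.example/p/1", catalog, trusted))
print(may_resolve("https://unknown.example/p/1", catalog, trusted))
```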
Thanks for the discussion during the meeting, Matthias Binzer
This discussion in ODRL might be also relevant: https://github.com/w3c/odrl/issues/12
Adding my two (three) cents:
the transfer size is not considered an issue, because the response could be gzipped content
In my view this should be taken into account, as the (HTTP) protocol binding should make an explicit decision:
- either the content is plain JSON (then the size is a factor), or
- the content shall be compressed, or
- both are possible (then both clients and servers need to implement functions for both)
For now, option 1 is described, therefore the size of the body can be a factor.
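To get a feeling for the sizes involved, one could compare the three variants locally. This is a rough sketch only: the catalog shape (a "dcat:dataset" array), the dataset count of 200 and the helper `catalog_bytes` are assumptions made for illustration, and real numbers depend on the actual catalog.

```python
import gzip
import json

# Inline policy, as in the catalog example.
policy = {
    "@id": "urn:uuid:2828282:3dd1add8-4d2d-569e-d634-8394a8836a88",
    "permission": [
        {
            "action": "use",
            "constraint": [
                {
                    "leftOperand": {"@value": "spatial"},
                    "rightOperand": {"@value": "EU"},
                    "operator": "EQ",
                }
            ],
            "duty": [],
        }
    ],
    "prohibition": [],
    "obligation": [],
}

# Reference-only entry, as in the proposal.
reference = {
    "@type": "odrl:Offer",
    "@id": "https://provider.com/policies/"
    "cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed",
}


def catalog_bytes(policy_entry: dict, datasets: int = 200) -> bytes:
    """Serialize a hypothetical catalog where every dataset has the same policy."""
    catalog = {
        "dcat:dataset": [
            {"@id": f"urn:uuid:dataset-{i}", "odrl:hasPolicy": [policy_entry]}
            for i in range(datasets)
        ]
    }
    return json.dumps(catalog).encode("utf-8")


inline_plain = catalog_bytes(policy)
inline_gzip = gzip.compress(inline_plain)
reference_plain = catalog_bytes(reference)

print("inline, plain JSON:   ", len(inline_plain), "bytes")
print("inline, gzipped:      ", len(inline_gzip), "bytes")
print("references, plain JSON:", len(reference_plain), "bytes")
```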
the hash identifier was considered a good idea to uniquely identify a policy
Putting additional information (like type declarations or other things like a content hash) into an identifier is, if I remember correctly, regarded as a not-so-good pattern in the RDF/Linked Data world. I don't remember all the details, but most likely it boils down to the fact that the referenced JSON document usually (always?) needs to contain the identifier itself:
{
"@context": {
"@vocab": "https://www.w3.org/TR/odrl-model/"
},
"@id": "https://provider.com/policies/cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed",
"permission": [
...
]
}
In that case, "cdfd26aaf5b1fdc6d71af7c1349869f9314b67626bc1eec44e64af674e357eed" cannot be the hash of the JSON document as it would need to contain itself...
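The circularity can be shown in a few lines. The second half of the sketch goes beyond what was discussed here: excluding "@id" before hashing is just one conceivable convention, stated as an assumption, and `digest` / `digest_without_id` are hypothetical helpers using sorted-key JSON as a stand-in canonicalization.

```python
import hashlib
import json


def digest(doc: dict) -> str:
    """SHA-256 over a deterministic JSON form (assumed canonicalization)."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


body = {
    "@context": {"@vocab": "https://www.w3.org/TR/odrl-model/"},
    "permission": [{"action": "use"}],
}

h = digest(body)
with_id = {**body, "@id": "https://provider.com/policies/" + h}

# Embedding the hash-based "@id" changes the hashed bytes, so the document
# as served no longer hashes to the value inside its own identifier.
print(digest(with_id) == h)


def digest_without_id(doc: dict) -> str:
    """Assumed convention: leave the top-level "@id" out of the hash input."""
    return digest({k: v for k, v in doc.items() if k != "@id"})


# Under that convention the served document can be verified again.
print(digest_without_id(with_id) == h)
```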
dataspace should define whether such document resolutions must be allowed.
I'd like to see it in the schemas, either "only expect '@id' here" (reference case) or "expect a full odrl:Offer object here" to reduce the degree of freedom for the individual implementations / increase their interoperability. But this makes it more complicated for different data spaces as they would need deviating schema files, manage their versions accordingly, ...
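What such an either-or rule would mean for an implementation can be sketched without a schema language. The function names and the exact key sets below are assumptions for illustration, not part of any existing schema file.

```python
RULE_KEYS = {"permission", "prohibition", "obligation"}


def classify_policy_entry(entry: dict) -> str:
    """Classify a hasPolicy entry as 'reference' or 'embedded'."""
    keys = set(entry) - {"@type"}
    if keys == {"@id"}:
        return "reference"  # only an identifier: the reference case
    if "@id" in entry and keys & RULE_KEYS:
        return "embedded"  # a full odrl:Offer object
    raise ValueError("neither a bare reference nor a full offer")


def conforms(entries: list, mode: str) -> bool:
    """Check that every entry matches the single mode a dataspace has chosen."""
    return all(classify_policy_entry(e) == mode for e in entries)


entries = [
    {"@type": "odrl:Offer", "@id": "https://provider.com/edc/offer/1"},
    {
        "@id": "urn:uuid:2828282:3dd1add8-4d2d-569e-d634-8394a8836a88",
        "permission": [{"action": "use"}],
        "prohibition": [],
        "obligation": [],
    },
]
print([classify_policy_entry(e) for e in entries])
print(conforms(entries, "reference"))
```

A catalog mixing both forms, like the one above, would be rejected under either mode, which is the reduced degree of freedom argued for here.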
After some discussion over the last weeks regarding JSON-LD and how arrays and timestamps are represented (#139 and #125), and the resulting new JSON-LD context file proposed here: https://github.com/International-Data-Spaces-Association/ids-specification/issues/132#issuecomment-1658376829, I tried to think about the consequences of such changes and came back to this issue.
Luckily, @mkollenstart could also spend some time and we had some deeper discussions on the matter and came up with potential options to go forward with:
Option 1:
JSON-LD allows remote / referenced documents anyway, and we do NOT explicitly allow or disallow this in the text. The proposal is to explicitly allow this for hasPolicy (the entire Policy) or one level deeper, for the Rules inside a policy, e.g. under "permission": [
Currently we don't see a way to express this in a JSON-LD context directly, so I think it should be described in the text.
Option 2:
Make the referenced documents for hasPolicy and/or Rules the default. In many cases the information is very repetitive anyway.
This would require 2 additional endpoints in the catalog interface: /policies/<id> and /rules/<id>
An advantage of Option 2 is that the id could be any id, including a non-resolvable one, e.g. a uuid, and the endpoint would require the same auth mechanisms as the regular catalog interface.
Also, the references can be http:// identifiers. In such cases, the biggest question is how to deal with authentication at such endpoints. Probably the easiest approach would be to say that such http references should be publicly available endpoints, like schema.org and others. A consumer might always decide NOT to fetch from unknown endpoints!
I think a dataspace might also define such a policy and rules repository and also might define separate authentication mechanisms for it, e.g. checking a dataspace Membership Credential. But I think this is out of scope for the DSP spec itself.
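To make Option 2 a bit more concrete, here is a toy in-memory model of the two lookups. Everything in it is hypothetical: the class `PolicyRepository`, its methods and the ids are illustration only, and authentication as well as the HTTP layer are left out entirely.

```python
from typing import Optional


class PolicyRepository:
    """In-memory stand-in for the /policies/<id> and /rules/<id> endpoints."""

    def __init__(self) -> None:
        self._stores = {"policies": {}, "rules": {}}

    def put(self, kind: str, ident: str, document: dict) -> None:
        """Register a policy or rule under an arbitrary id, e.g. a uuid."""
        self._stores[kind][ident] = document

    def get(self, path: str) -> Optional[dict]:
        """Resolve a path such as /policies/<id>; None models a 404."""
        kind, _, ident = path.strip("/").partition("/")
        store = self._stores.get(kind)
        return None if store is None else store.get(ident)


repo = PolicyRepository()
repo.put("rules", "r-1", {"action": "use", "constraint": [], "duty": []})
repo.put(
    "policies",
    "3dd1add8-4d2d-569e-d634-8394a8836a88",
    {"permission": [{"@id": "r-1"}], "prohibition": [], "obligation": []},
)

policy = repo.get("/policies/3dd1add8-4d2d-569e-d634-8394a8836a88")
rule = repo.get("/rules/" + policy["permission"][0]["@id"])
print(rule["action"])
print(repo.get("/policies/unknown"))
```

Because the ids are only looked up within the provider's own catalog interface, they do not need to be resolvable URLs themselves.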
Any further thoughts on this? Let's discuss this tomorrow in our weekly meeting.
--
Matthias Binzer
Having this would remove a lot of the constraints that the DSP places on the usage of Linked Data. I'm unsure whether having JSON Schemas in addition to SHACL shapes would even be feasible in that case.
Also, there hasn't been any progress on this in ten months, so I suggest closing this ticket.
I would consider this a potential optimization of the DSP; we simply didn't work on it in order to get the initial version 0.8 released. How should we deal with such open topics? We should not just close them.
I'm opposed to increased flexibility in the protocol's message payloads. But yea, perhaps there should be a structured WG decision on this. @ssteinbuss - WDYT?
Work on DSP is now moved to Eclipse. We should raise new issues in the Eclipse WG.
That was discussed in our last call on Thursday. We will assess each issue in this repo and decide which to move to the Eclipse Project. I doubt that we should bring each issue of the project to the Working Group level.