policies for creators/publishers

Open joepvgenuchten opened this issue 2 months ago • 16 comments

As a data architect and/or someone with data governance roles, I would like to use ODRL to express policies to which data publishers should comply.

As an example, I would like to be able to say that a publisher of a dataset needs to ensure/promise its users that it will comply with https://www.go-fair.org/fair-principles/f1-meta-data-assigned-globally-unique-persistent-identifiers/

I think this can be solved by adding a third top-level action (next to use and transfer) called something like 'create', 'produce', or 'publish'.

I imagine that in this case the assignee is the creator/producer/publisher of the dataset, and the assigner is either the producer/publisher themselves (declaring a self-imposed policy), some kind of data governance role, or a user/consumer role that places certain requirements on the data(set/product) they consume.

PS: If this use case is already served in some other way, please let me know. For instance, would the use action be appropriate in this scenario?

joepvgenuchten avatar Nov 04 '25 16:11 joepvgenuchten

We are proposing to expand ODRL and enable the use of Templates as this is an example of a much wider problem (see the linked issue).

In your case, since your data doesn't exist when the policy requirements are defined, there is no way to say uri_jeop_asset1 -[odrl:hasPolicy]-> uri_best_practices_policy_123, and no rules can target the asset (it doesn't exist yet, since the governance pre-dates the instance). A template would work something like this:

  • object_class_jeop has policy templates (e.g. uri_best_practices_policy)
  • When an instance uri_jeop_asset1 of that class is created, we can instantiate a policy uri_best_practices_policy_123 and fill in the blanks (asset, assignee, and assigner).

In your data governance ecosystem, the enforcement point (PEP) would ask the decision point (PDP) some new "questions" (tbd), such as "The assignee X is asking for access to an object". The PDP can then check whether the assignee has a policy, or whether the class of that object has one or more -[hasMasterPolicy]-> relationships, and instantiate the templates by filling in the blanks (this assumes some shared catalogue of templates).
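
As a rough illustration of the template idea, the Turtle below shows a template attached to a class and one policy instantiated from it. Templates are only a proposal at this point, so everything in the `ex:` namespace (including `ex:PolicyTemplate`, `ex:hasMasterPolicy`, `ex:instantiatedFrom`, and the party identifiers) is a placeholder rather than an ODRL term; the remaining identifiers mirror the ones used in this comment.

```turtle
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix ex:   <https://example.org/ns#> .   # hypothetical namespace

# Template attached to the class: asset, assignee, and assigner are left blank.
ex:object_class_jeop ex:hasMasterPolicy ex:best_practices_policy .

ex:best_practices_policy a ex:PolicyTemplate ;
    odrl:permission [ odrl:action odrl:use ] .

# Created when an instance of the class comes into existence.
ex:jeop_asset1 a ex:object_class_jeop ;
    odrl:hasPolicy ex:best_practices_policy_123 .

# Instantiated policy: the blanks are filled in for this asset.
ex:best_practices_policy_123 a odrl:Agreement ;
    ex:instantiatedFrom ex:best_practices_policy ;
    odrl:permission [
        odrl:target   ex:jeop_asset1 ;
        odrl:assigner ex:governance_team ;
        odrl:assignee ex:assignee_x ;
        odrl:action   odrl:use
    ] .
```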

joshcornejo avatar Nov 05 '25 11:11 joshcornejo

Thanks, this helps a lot, although I disagree that uri_jeop_asset1 -[odrl:hasPolicy]-> uri_best_practices_policy_1 would not be meaningful. I can imagine two scenarios:

  • Coming from the DCAT side of things, I would be able to create a dataset that has as status 'planned' or something like that. I could then express that the (planned) dataset will conform to the policy. I could even say that the status won't change to 'production' until the policy is conformed to.
  • Coming from a (DCAT) data service or (DPROD) data product perspective, my service/product will serve continuously changing/updating information. In this context, the policy would be more like a service-level agreement.

The fact that the actual data doesn't exist yet (or is not serialized in a dcat:Distribution or adms:AssetDistribution) doesn't (shouldn't) mean I cannot make metadata statements about it.
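
A minimal sketch of the first scenario, assuming `adms:status` is used to carry the lifecycle status. The `ex:` namespace, the `ex:planned` status concept, and the title are placeholders; the asset and policy identifiers follow the ones used above.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix adms:    <http://www.w3.org/ns/adms#> .
@prefix odrl:    <http://www.w3.org/ns/odrl/2/> .
@prefix ex:      <https://example.org/ns#> .   # hypothetical namespace

# The dataset has a stable URI and metadata before any distribution exists.
ex:jeop_asset1 a dcat:Dataset ;
    dcterms:title "Planned dataset"@en ;
    adms:status ex:planned ;                      # placeholder status concept
    odrl:hasPolicy ex:best_practices_policy_1 .   # the policy it is meant to satisfy
```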

joepvgenuchten avatar Nov 05 '25 12:11 joepvgenuchten

Coming from the DCAT side of things, I would be able to create a dataset that has as status 'planned'.

You already have a URI that won't change and the object already exists, so you can point to a policy and the rules can target the asset. You have no problem, then, and the actions are not "shoulds/coulds" due to uncertainty, but "musts" because you have defined them all?

joshcornejo avatar Nov 05 '25 12:11 joshcornejo

haha, my apologies, my could/should were not that formal in nature :) 'must' is indeed a better term here.

If using ODRL policies for planned datasets is totally legitimate, I still have a remaining question about the 'Action' class, as each Policy MUST have a Rule and a Rule MUST have an Action associated with it that is either one of the top-level actions or a specialization of them. Right now, the top-level actions are 'use' and 'transfer'.

  • use - actions that involve general usage by parties.
  • transfer - actions that involve the transfer of ownership to third parties.

Neither of these feels appropriate for a policy that a creator/publisher needs to comply with. The policy is not about use by parties, and it's not about transfer of ownership; it is a policy that describes the agreed-upon nature of the dataset itself, for which the creator/publisher is responsible. Reading the above definitions, a publication action doesn't seem (based on my interpretation of what is written) like a legitimate specialization of either of those (use or transfer).

So I guess my question would be: What kind of action would be appropriate for this use case? And if there is none, can we have a new top-level action to account for it?

joepvgenuchten avatar Nov 05 '25 14:11 joepvgenuchten

Your original example:

I would like to be able to say that a publisher of a dataset needs to ensure/promise its users that it will comply with https://www.go-fair.org/fair-principles/f1-meta-data-assigned-globally-unique-persistent-identifiers/

is about the shape of the asset (ODRL asks for a uid, but that's left to the implementer). There is no policy that can 'force' the creation of unique persistent attributes; you have to check the asset for that.

create and publish sound more like lifecycle items (also release) rather than rights to govern the asset (e.g. I can't force anyone to publish an asset).

The first sentence:

I would like to use ODRL to express policies to which data publishers should comply.

The "should comply" is where I thought templates would be your answer.

Finally, you can add your own profile with your own actions if needed:

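@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix joepNameSpace: <https://example.org/joep#> .  # placeholder profile namespace

joepNameSpace:joepAction1
	a odrl:Action, skos:Concept ;
	rdfs:isDefinedBy joepNameSpace: ;
	rdfs:label "Joep's action expanse"@en ;
	skos:definition "Example Action."@en .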

joshcornejo avatar Nov 05 '25 14:11 joshcornejo

Another option would be a policy that was an Obligation (assignee = asset owner) and the action could be ex:conform with a refinement of ex:conformanceRule = "URI"
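
A minimal Turtle sketch of such an obligation, under stated assumptions: the `ex:` namespace, `ex:conform`, `ex:conformanceRule`, and the party and asset identifiers are all placeholders, and the FAIR F1 URL from the original post stands in for the "URI". A real profile would also have to declare `ex:conform` as an action and `ex:conformanceRule` as a left operand.

```turtle
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:   <https://example.org/ns#> .   # hypothetical profile namespace

ex:policy_f1 a odrl:Agreement ;
    odrl:obligation [
        a odrl:Duty ;
        odrl:assigner ex:governance_board ;   # placeholder party
        odrl:assignee ex:asset_owner ;        # the asset owner carries the duty
        odrl:target   ex:dataset1 ;
        odrl:action [
            rdf:value ex:conform ;            # hypothetical profile action
            odrl:refinement [
                a odrl:Constraint ;
                odrl:leftOperand  ex:conformanceRule ;   # hypothetical left operand
                odrl:operator     odrl:eq ;
                odrl:rightOperand <https://www.go-fair.org/fair-principles/f1-meta-data-assigned-globally-unique-persistent-identifiers/>
            ]
        ]
    ] .
```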

riannella avatar Nov 05 '25 15:11 riannella

That obligation doesn't smell right. For example, the entities in DCAT already use dcterms:conformsTo, so not sure why the owner of an asset wouldn't conform to their own specification :)

joshcornejo avatar Nov 05 '25 15:11 joshcornejo

I would have no problem specifying my own action, but I still run into this part of the specification:

"An Action (except for use and transfer) MUST have one includedIn property value (of type Action) to transitively assert this Action that encompasses its operational semantics."

Also, some responses to other comments, to give more context:

"there is no policy that can 'force' the creation of unique persistent attributes": I honestly do not see why not. When I file my taxes, my national tax revenue service requires me to put my SSN/personal number/etc. on the form. When you sell a car, you need to add the license plate and/or VIN to the ownership transfer form, and so on. Also, to be clear, the persistent identifiers were just an example. I can also imagine a policy where a publisher of data is required to provide data lineage information (another FAIR requirement), where a publisher of PII complies with the GDPR and only makes that data available to those who have a legitimate use for it, or where the publisher is required to make data available within a certain time frame after an event occurred.

As for dcterms:conformsTo, that is indeed similar, but in my line of work, datasets are often published in conformance with standards without that being attached to a policy, just out of convenience: for instance, geospatial data that conforms to GML.

"so not sure why the owner of an asset wouldn't conform to their own specification": in my experience/environment, this happens all the time. Stuff just happens and breaks down. This is exactly why service-level agreements (SLAs) exist. And my understanding of ODRL is that it is aimed at modelling (amongst others) 'agreements'.

joepvgenuchten avatar Nov 05 '25 18:11 joepvgenuchten

I can understand why "use" would not really directly cover conformance, as it was intended as types of actions to perform with the asset.

Perhaps we do need a new top-level action, say "governance" which is all the management actions to perform on the asset...

riannella avatar Nov 06 '25 03:11 riannella

Re: @joepvgenuchten's comment - you fill in your taxes on a pre-defined form that already has a "conformsTo"; you can't come up with your own form template. The URN that you enter for yourself is validated by the consumer of the filled-in form, and no government official can file their taxes on an ad-hoc form either, meaning that everyone has to "conformsTo" the same shape. Aren't you mixing two topics? There is also the aspect of 'conformance validation' (this is discussed on D-PROD as well), where a SHACL shape is attached to make sure conformance is ensured.

And regarding the actual value of a unique identifier, that is not something a policy can guarantee?

Re: @riannella's comment - If any digital type of stream can be considered an asset, then you are defining a new type of rule in the space of "Service Portfolio Management", rather than a new type of action under "Permission/Prohibition/Obligation"?

IMHO odrl:Obligation cannot be a catch-all for every type of "responsibility", hence the work on defining ODRL-S for rules like "warranties, indemnities, liabilities" (which are also synonyms of 'obligation' but imply a different flow in the asset ecosystem?).

joshcornejo avatar Nov 06 '25 07:11 joshcornejo

@riannella 'governance' is a good angle. use and transfer are verbs so perhaps it would be good to come up with a verb for this idea as well. I was thinking 'audit' but it may be too narrow. 'manage' perhaps?

@joshcornejo I feel you are too focused on the specific example of globally unique and persistent identifiers. While I disagree with your objections to that specific example, I feel that diving into it further gets the discussion off track. My question here is about a more generic phenomenon, where producers/creators of data have constraints laid upon them that come out of agreements (which can be more or less explicit) made between creators/publishers of data and its users, or some other third party.

In that context, I do want to respond to this statement: "[...] that is not something a policy can guarantee?"

In general, I don't think a policy can guarantee anything. A law stating that you cannot drive faster than 50 km/h does not guarantee that no one will break it. An AI organization's policy stating that they will not use chat history to train their models does not guarantee that no one will build a system that does it anyway. Yet those policies are still useful, as they provide a frame of reference against which we can say: rules/agreements/obligations/prohibitions were violated, now we can start to talk about consequences (whether in terms of remedies or perhaps even legal consequences). I have always interpreted policies as broader than something we can by definition check automatically (even though that's a great feature for those use cases where it is possible), and more as an expression of an aspiration between parties on how they want to interact (in this case with regard to data). I feel that is not how you interpret it, and so this may be the cause of our disagreement. If my interpretation of the meaning of Policy in the context of ODRL is incorrect, I would love for it to be refined.

joepvgenuchten avatar Nov 06 '25 07:11 joepvgenuchten

I agree with your general statement that in the real world, not all policies can be enforced. But firewall or network routing policies are deterministic and can't be breached (unless there's a bug, but that's not the point). The fact that you can break the speed limit is due to the lack of "enforcement" services.

In a data marketplace, with data products and interfaces for access, if your policies are "decorative and can't be enforced", you have just a file-sharing system. What makes it a marketplace is the ability to "contract" and enforce those contracts (at its simplest: can you access the asset?), but also to make sure that the lifecycle can be practically met (lifecycle being a key element, you don't want chicken-and-egg scenarios or cross-boundaries between service access and service management).

In data governance, most of the work is on surfacing policy breaches and establishing new procedures for those breaches to be closed, but if all you want is to write them down, is a bullet list sufficient? Why do you need a "machine-readable" definition that people can still bypass?

joshcornejo avatar Nov 06 '25 07:11 joshcornejo

Agreed. I am trying to implement this in a context where the maturity of the involved parties is not high enough and they do not have these fully automated marketplaces yet. I have found ODRL a very powerful conceptual model (and language) to start talking (with people in organizations to which all of this is very new) about policy and governance. Creating these machine-readable artifacts, even if they are just decorative, is a first step on a journey that will hopefully end up looking more like a data space or something like that.

joepvgenuchten avatar Nov 06 '25 08:11 joepvgenuchten

What are the next steps in this regard? I'd be happy to contribute if I can.

joepvgenuchten avatar Nov 09 '25 10:11 joepvgenuchten

The options are:

  1. You define your own ODRL Profile - including a new top-level Action (e.g. manage).
  2. You accept that "use" (in the context of a dataset publisher) covers the management of the asset and then propose a new action (e.g. conforms) and constraint left operand (e.g. conformStandard) in the community vocab.
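
To make the difference between the two options concrete, here is a hedged Turtle sketch of the vocabulary terms each one would need. The `ex:` namespace and the definitions are placeholders, and the term names are just the examples from the list above, not agreed vocabulary.

```turtle
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/ns#> .   # hypothetical profile namespace

# Option 1: the profile declares its own top-level action,
# i.e. one that is not included in odrl:use or odrl:transfer.
ex:manage a odrl:Action, skos:Concept ;
    skos:definition "Management actions performed on the asset."@en .

# Option 2: the new action stays under odrl:use, and a new
# left operand names the standard to conform to.
ex:conforms a odrl:Action, skos:Concept ;
    odrl:includedIn odrl:use ;
    skos:definition "To conform to a stated standard."@en .

ex:conformStandard a odrl:LeftOperand, skos:Concept ;
    skos:definition "The standard the asset must conform to."@en .
```

Only option 2 satisfies the current rule that every action other than use and transfer has an includedIn value; option 1 relies on the profile changing that part of the model.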

I think in a future ODRL version we would review the top-level actions and make them more expressive - but currently they are core Model semantics.

riannella avatar Nov 14 '25 03:11 riannella

I am late to the party, but this seems similar to issue #35. There, we argue that there is a need for ODRL actions to indicate that parties are/aren't allowed to create resources.

The concept create has been added to the community vocabulary by @besteves4

woutslabbinck avatar Nov 24 '25 16:11 woutslabbinck