Entry Points for RO-Crate Profiles
In the current specification, profiles are always defined on a complete RO-Crate. If such a profile specifies requirements on the root data entity, it cannot be used in a modular or composable way, since subdirectories in RO-Crates don't specify a root data entity. An example of such a profile is the Workflow-Run-RO-Crate.
This issue came up at the BioHackathon Europe 2024, project 19 (@elichad @dnlbauer @floWetzels). Since composed research objects seem to be common and it should be possible to model them in RO-Crates without redundancy in profiles, we propose introducing an entry point mechanism into the RO-Crate specification; see https://github.com/dnlbauer/bh24-ro-crate-extension for details (will be fleshed out in the future).
Subscribe to this issue to stay updated on the development.
To consider:
- can/should this work recursively? Having an entry point inside another entry point?
- what should happen if multiple entry points conform to the same profile? e.g. in the case of uploading a crate with multiple workflow entry points to WorkflowHub
In this case, the service could decide to decompose the RO-Crate into "atomic" (for lack of a better term) RO-Crates which can be handled like normal RO-Crates. For example, WorkflowHub could decide to create multiple entries, one for each entry point in the crate; a workflow execution engine executing a workflow from an RO-Crate could present a drop-down selection to the user or require an entry point to be specified during workflow submission.
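As a rough sketch of how a service could enumerate the candidates for such a decomposition or drop-down: the code below groups entities by the profile they declare via conformsTo. It assumes that an entry point is simply any entity other than the metadata descriptor and the root data entity ("./") that carries conformsTo; the helper name `entry_points_by_profile`, the sample crate, and the profile URI are made up for illustration and are not part of any specification.

```python
from collections import defaultdict

# Placeholder profile URI, for illustration only.
PROFILE = "https://example.org/profiles/workflow"

# Assumption: the descriptor and root data entity use the conventional ids.
SKIPPED_IDS = {"ro-crate-metadata.json", "./"}


def as_list(value):
    """Normalise a JSON-LD property value (missing, single, or list) to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def entry_points_by_profile(metadata):
    """Map each profile URI to the ids of non-root entities that conform to it."""
    grouped = defaultdict(list)
    for entity in metadata["@graph"]:
        if entity["@id"] in SKIPPED_IDS:
            continue
        for ref in as_list(entity.get("conformsTo")):
            grouped[ref["@id"]].append(entity["@id"])
    return dict(grouped)


crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "about": {"@id": "./"},
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "hasPart": [{"@id": "wf-a/"}, {"@id": "wf-b/"}],
        },
        {"@id": "wf-a/", "@type": "Dataset", "conformsTo": {"@id": PROFILE}},
        {"@id": "wf-b/", "@type": "Dataset", "conformsTo": [{"@id": PROFILE}]},
    ],
}

# Two entities conform to the same profile, so a service would have to
# create two entries or ask the user which one to use.
choices = entry_points_by_profile(crate)[PROFILE]
print(choices)
```

A service that only accepts a single entry point could treat `len(choices) > 1` as the trigger for either splitting the crate or asking the user to pick one.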
Intending to discuss this issue at the RO-Crate community call at 8:00 UTC tomorrow
Missed the meeting and thus the discussion in the call. Still sharing my two cents.
From a semantic point of view there is nothing preventing you from declaring additional triples with the dcterms:conformsTo predicate attached to any available part (subject) in the graph. And I don't think the RO-Crate spec formulates any restriction on that either. If anything, the JSON-LD context just makes it handy to use conformsTo keys in the JSON-LD to add these.
Its value is expected to contain an identifier (URI) for a standard, that
- just conceptually represents a number of assumptions clients can make about the subject
- allows those clients to verify if they have the knowledge on board to deal with that
As such one could be using the conformsTo in ro-crates in combination with
- data entities of type File to express e.g. that the file is not just a netCDF file but conforms to the CF conventions, or even a CSV file that sticks to some layout or schema, ...
- conceptual entities describing data services that e.g. conform to some web service API standard (like OGC WMS, ERDDAP, ...)
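To make the two cases above concrete, here is a small sketch of what such entities might look like in the @graph, written as Python dicts. The standard identifiers are placeholder URIs (I have not looked up canonical ones), the "Service" type for the data service is just one plausible choice, and `conformance_claims` is a hypothetical helper.

```python
import json

# Placeholder identifiers for the standards; real crates should use the
# URIs that the respective communities actually publish.
CF_CONVENTIONS = "https://example.org/standards/cf-conventions"
OGC_WMS = "https://example.org/standards/ogc-wms"

entities = [
    {
        # A data entity: a netCDF file that also follows the CF conventions.
        "@id": "data/sea_surface_temp.nc",
        "@type": "File",
        "encodingFormat": "application/x-netcdf",
        "conformsTo": {"@id": CF_CONVENTIONS},
    },
    {
        # A contextual entity describing a data service (type is illustrative).
        "@id": "https://example.org/wms",
        "@type": "Service",
        "name": "Example map service",
        "conformsTo": {"@id": OGC_WMS},
    },
    {
        # An entity without any conformance claim.
        "@id": "README.md",
        "@type": "File",
    },
]


def conformance_claims(entities):
    """Return {entity id: [standard URIs]} for entities that declare conformsTo."""
    claims = {}
    for entity in entities:
        value = entity.get("conformsTo", [])
        refs = value if isinstance(value, list) else [value]
        if refs:
            claims[entity["@id"]] = [ref["@id"] for ref in refs]
    return claims


print(json.dumps(conformance_claims(entities), indent=2))
```

A client can compare the collected URIs against the standards it knows how to handle and ignore the rest.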
This way of applying dcterms:conformsTo exists outside the RO-Crate concept and can be applied to any part of it as far as I see. The fact that the RO-Crate specification additionally introduced some specific suggestions to express conformity of ro-crates was considered as a useful and clear mechanism to guide people into some kind of "duck-type" declaring of valid assumptions on the crate contents. The fact RO-Crate 1.2 introduces some guidance on this level
- does not in any way limit other usage of this mechanism (including the suggested nesting)
- nor should it raise the expectation that because of that the RO-Crate specification suddenly needs to control, document, or worse: forbid any more nested/detailed application of that same mechanism.
If anything, IMHO the RO-Crate spec should state that it deliberately does not want to interfere at that level of detail.
Summarising discussion from community call 2024-11-14:
- Using the EntryPoint type is not strictly required since it could be inferred from checking if any conformsTo on an entity is an RO-Crate profile. However it is convenient (for tooling) to make entry points explicit within the crate metadata.
- Using @type to indicate entry points may not be the best choice, as @type usually describes what the actual thing represented by the entity is (e.g. a File, a Person, a Place), and an entry point is just a construct in the metadata
  - the existing EntryPoint type in schema.org is intended for describing API endpoints and such, this isn't quite the same as our idea, we wouldn't want someone to describe an API using RO-Crate and end up with weird conflicts because of the overloaded type
  - we could find a different property to use (but we haven't found a good one so far)
  - in our example we also included the entry points under "about" in the metadata entity, which is another way to make them easily discoverable by tools
  - ISA profile uses additionalType to indicate Investigation, Study, Assay https://github.com/nfdi4plants/isa-ro-crate-profile/blob/release/profile/isa_ro_crate.md - potentially do something similar to indicate entry points?
- Do we need a “Crate” type?
- Highlighting of entrypoints GUI wise - they are possible views of the crate
- Profiles often talk about the root crate - the entrypoint would be a mechanic for profiles to talk about their root without necessarily that being the RO-Crate Root
- Want to make RO-Crate profiles more like mix-ins that don’t require being in an RO-Crate
- https://www.researchobject.org/ro-crate/specification/1.2-DRAFT/data-entities.html#referencing-other-ro-crates uses conformsTo and type Dataset to indicate an external crate. But what if it only conforms to a profile without being an RO-Crate?
- RO-Crate V2 is moving towards "fragments". Also does not have to be a Dataset that is the “root”. Increasingly profiles of fragments can then be used. This idea fits well with that vision
- If we add EntryPoint there may only be certain properties that we should follow recursively to scope the “sub-crate” e.g. hasPart, mentions, mainEntity. But which links should NOT be followed?
- Alternative of using named @graph { fragments } to isolate the scope? Can get quite complicated.
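For the scoping question above, a minimal sketch of what "following only certain properties" could look like: a breadth-first walk from an entry point that follows hasPart, mentions and mainEntity and nothing else. The function name `sub_crate_ids`, the sample graph, and the choice to ignore every other property (author, result, ...) are assumptions for illustration; which links should not be followed is exactly the open question.

```python
from collections import deque

# The properties named in the discussion; everything else is not followed.
FOLLOW = ("hasPart", "mentions", "mainEntity")


def sub_crate_ids(graph, entry_id, follow=FOLLOW):
    """Collect the ids reachable from entry_id via the properties in `follow`."""
    by_id = {entity["@id"]: entity for entity in graph}
    seen = {entry_id}
    queue = deque([entry_id])
    while queue:
        entity = by_id.get(queue.popleft())
        if entity is None:
            continue  # referenced, but not described in this crate
        for prop in follow:
            value = entity.get(prop, [])
            refs = value if isinstance(value, list) else [value]
            for ref in refs:
                target = ref.get("@id") if isinstance(ref, dict) else None
                if target and target not in seen:
                    seen.add(target)
                    queue.append(target)
    return seen


graph = [
    {"@id": "./", "@type": "Dataset",
     "hasPart": [{"@id": "run-1/"}, {"@id": "README.md"}]},
    {"@id": "run-1/", "@type": "Dataset",
     "hasPart": [{"@id": "run-1/out.csv"}],
     "mentions": {"@id": "#action-1"},
     "author": {"@id": "#alice"}},
    {"@id": "run-1/out.csv", "@type": "File"},
    {"@id": "#action-1", "@type": "CreateAction",
     "result": {"@id": "run-1/out.csv"}},
    {"@id": "#alice", "@type": "Person"},
    {"@id": "README.md", "@type": "File"},
]

# "#alice" is linked via author and "README.md" belongs to the root only,
# so neither ends up in the scoped sub-crate.
print(sorted(sub_crate_ids(graph, "run-1/")))
```

Note that with this rule set the author of the entry point falls outside the sub-crate, which may or may not be what a profile wants; that is the kind of decision the property list would have to encode.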
We'll revisit this after the release of v1.2 when we discuss what will be included in v2 (February at the earliest, I think)
This would be useful for things like CSVW schemas describing tabular Data Entities
If a 'sub profile' is required to work on a defined collection of elements then there are a number of mechanisms for creating these collections by compiling them into a list, e.g. https://schema.org/ItemList, and a profile could then validate only the list items
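A rough sketch of that idea, assuming the collection is a schema.org ItemList whose itemListElement values reference the entities directly by @id. The rule `has_encoding_format` stands in for whatever a real sub profile would require; it and the helper names are hypothetical.

```python
def list_item_ids(graph, list_id):
    """Return the ids referenced by the itemListElement of the given ItemList."""
    by_id = {entity["@id"]: entity for entity in graph}
    elements = by_id[list_id].get("itemListElement", [])
    elements = elements if isinstance(elements, list) else [elements]
    return [element["@id"] for element in elements]


def validate_list_items(graph, list_id, check):
    """Apply `check` only to the entities that are members of the list."""
    by_id = {entity["@id"]: entity for entity in graph}
    return {item_id: check(by_id[item_id])
            for item_id in list_item_ids(graph, list_id)}


def has_encoding_format(entity):
    """Hypothetical profile rule: the entity must declare an encodingFormat."""
    return "encodingFormat" in entity


graph = [
    {"@id": "#tables", "@type": "ItemList",
     "itemListElement": [{"@id": "a.csv"}, {"@id": "b.csv"}]},
    {"@id": "a.csv", "@type": "File", "encodingFormat": "text/csv"},
    {"@id": "b.csv", "@type": "File"},
    {"@id": "notes.txt", "@type": "File"},  # not in the list, never checked
]

report = validate_list_items(graph, "#tables", has_encoding_format)
print(report)
```

Entities outside the list are never looked at, so the rest of the crate stays free to follow other profiles or none.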
agreed in steering committee 2025-09-04:
- Profile conformance can be applied at any level of the RO-Crate, not just at root.
- RO-Crate 1.2 implementation note to highlight how a profile can allow itself to be applied not just at root level