Use Case: Resolve RO-Crate from persistent identifier/DOI with landing page

Open stain opened this issue 4 years ago • 3 comments
As a type of user, I want some goal so that some reason.

As a potential programmatic consumer of RO-Crates I want to find the RO-Crate metadata file given a persistent identifier/DOI so that I can index/catalogue potential Crates.

As discussed in the RO-Crate call of 2021-07-08 and already explored in #154, this should be supported through a couple of options:

A persistent identifier for a creative work SHOULD give browsers a human-readable representation such as HTML. This MAY be equivalent to the RO-Crate Website, or a more specific rendering that just happens to have a corresponding RO-Crate.

To resolve a persistent identifier to machine-readable JSON-LD, the following approaches are recommended for retrieving its RO-Crate metadata file:

  1. HTTP Content-negotiation for the RO-Crate media type, for example:
    Requesting https://w3id.org/ro/profile/paradisec/0.1 with HTTP header
    Accept: application/ld+json;profile=https://w3id.org/ro/crate redirects to the RO-Crate Metadata file https://example.org/ro-profiles/paradisec-0.1.0/ro-crate-metadata.json
  2. FAIR Signposting in the HTTP response headers of a HEAD request, using rel="describedby" and the RO-Crate media type: Link: <https://example.org/workflows/29/ro-crate-metadata.json>; rel="describedby"; type="application/ld+json;profile=https://w3id.org/ro/crate"
  3. Parse the Landing Page HTML, looking for FAIR Signposting <link href="…" rel="describedby" type="…"> as above, or <script type="application/ld+json"> blocks embedding the metadata file as for RO-Crate Website. (Note: The <script> type does not include the profile.)
  4. The above approaches may fail, e.g. for content-delivery networks that do not support content-negotiation. One fallback, following the RO-Crate Structure, is to try resolving the path ./ro-crate-metadata.json from the resolved URI (after permalink redirects). For example:
    If permalink https://w3id.org/ro/profile/paradisec/0.1 redirects to https://example.org/ro-profiles/paradisec-0.1.0/, then get https://example.org/ro-profiles/paradisec-0.1.0/ro-crate-metadata.json
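Approaches 2 to 4 above could be sketched roughly as follows in Python, using only the standard library. This is an illustrative sketch, not part of the RO-Crate specification: the function names (describedby_from_link_header, parse_landing_page, fallback_metadata_url) are hypothetical, the Link header parsing is deliberately simplistic, and the code assumes the caller has already fetched the header value, the landing page HTML and the final redirected URL.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

RO_CRATE_PROFILE = "https://w3id.org/ro/crate"
METADATA_FILE = "ro-crate-metadata.json"


def describedby_from_link_header(link_header):
    """Approach 2: targets of rel="describedby" links typed with the RO-Crate profile.

    Simplistic parsing (assumption): links are separated by ", <" and the
    type parameter is quoted, as in the example in the issue text.
    """
    found = []
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.S)
        if not match:
            continue
        target, params = match.groups()
        rel = re.search(r'\brel\s*=\s*"?([^";]*)"?', params)
        media_type = re.search(r'\btype\s*=\s*"([^"]*)"', params)
        if (rel and "describedby" in rel.group(1).split()
                and media_type and RO_CRATE_PROFILE in media_type.group(1)):
            found.append(target)
    return found


class _LandingPageParser(HTMLParser):
    """Collects describedby <link> targets and embedded JSON-LD <script> blocks."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.jsonld_blocks = []
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link":
            rels = (attributes.get("rel") or "").split()
            media_type = attributes.get("type") or ""
            href = attributes.get("href")
            if href and "describedby" in rels and RO_CRATE_PROFILE in media_type:
                self.links.append(href)
        elif tag == "script" and attributes.get("type") == "application/ld+json":
            # The <script> type carries no profile, so any JSON-LD block is a candidate.
            self._in_jsonld = True
            self.jsonld_blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld_blocks[-1] += data


def parse_landing_page(html, base_url):
    """Approach 3: return (absolute describedby URLs, raw JSON-LD strings)."""
    parser = _LandingPageParser()
    parser.feed(html)
    links = [urljoin(base_url, href) for href in parser.links]
    return links, parser.jsonld_blocks


def fallback_metadata_url(resolved_url):
    """Approach 4: guess ./ro-crate-metadata.json relative to the resolved URI."""
    return urljoin(resolved_url, "./" + METADATA_FILE)
```

A consumer would presumably try these in order and stop at the first candidate that actually returns JSON-LD; the embedded script blocks still need to be checked to confirm they describe an RO-Crate, since their type does not say so.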

stain avatar Jul 08 '21 10:07 stain

Can we add some recommendations on this to the main spec? For instance, for this RO: https://w3id.org/dgarijo/ro/sepln2022, running curl -sH "Accept:text/html" -L https://w3id.org/dgarijo/ro/sepln2022 returns an HTML representation, while curl -sH "Accept:application/ld+json" -L https://w3id.org/dgarijo/ro/sepln2022 returns the JSON-LD representation associated with it.
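For programmatic consumers, the same content negotiation could be done from Python's standard library along these lines. This is only a sketch: build_request and fetch_representation are hypothetical helper names, and what the server returns for each Accept value depends entirely on how the permalink is configured.

```python
import urllib.request


def build_request(url, accept):
    """Prepare a GET request that asks for a specific media type."""
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_representation(url, accept, timeout=30):
    """Send the request; urlopen follows redirects, much like curl -L.

    Returns the final URL after redirects, the Content-Type reported by
    the server, and the raw response body.
    """
    with urllib.request.urlopen(build_request(url, accept), timeout=timeout) as response:
        return response.geturl(), response.headers.get("Content-Type"), response.read()
```

Calling fetch_representation("https://w3id.org/dgarijo/ro/sepln2022", "application/ld+json") would mirror the second curl command; checking the returned Content-Type is advisable, since a server that ignores the Accept header may still answer with HTML.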

dgarijo avatar Jan 12 '23 10:01 dgarijo

Partially described in How to retrieve a Profile Crate, but this should be generalized beyond Profile Crates and should also cover more reliable signposting.

stain avatar Apr 27 '23 16:04 stain

@ptsefton to check how we implemented this in a recent pull request.

stain avatar Feb 08 '24 08:02 stain

Fixed by #296, which is under review.

stain avatar May 23 '24 21:05 stain