ro-crate
How to reference and retrieve another RO-Crate
This PR fixes #228 #160
This generalizes the content-negotiation-or-Signposting section so that it no longer applies only to Profile Crates.
For ZIP files this is still vague, in that it says: "If the retrieved resource is a ZIP file (Content-Type: application/zip), then extract ro-crate-metadata.json, or, if the archive root only contains a single folder (e.g. folder1/), extract folder1/ro-crate-metadata.json".
I've also added a BagIt reference, as the metadata file would then sit in a second folder, e.g. folder1/data/ro-crate-metadata.json, and the checksums should be verified first, as we do in https://trefx.uk/5s-crate/0.4/#check-phase
As for referencing one RO-Crate from another, either the referenced RO-Crate can have its own distribution with a conformsTo:
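To make the ZIP lookup order above concrete, here is a minimal Python sketch. The function names `find_metadata_path` and `read_metadata` are hypothetical, and using the presence of `bagit.txt` to detect a BagIt bag is my assumption. Checksum verification, which should happen first for BagIt, is deliberately left out.

```python
import io
import json
import zipfile

METADATA = "ro-crate-metadata.json"


def find_metadata_path(names):
    """Return the archive path of the RO-Crate metadata file, or None.

    Hypothetical helper: looks in the archive root, then in a single
    root folder, and in each case also in a BagIt data/ payload folder.
    """
    files = {n for n in names if not n.endswith("/")}
    # First path segment of every member gives the top-level entries.
    top = {n.split("/", 1)[0] for n in names if n.strip("/")}
    prefixes = [""]
    if len(top) == 1:
        only = next(iter(top))
        # Only treat it as a folder if something is nested inside it.
        if any(n.startswith(only + "/") for n in names):
            prefixes.append(only + "/")
    for prefix in prefixes:
        if prefix + METADATA in files:
            return prefix + METADATA
        # Assumption: a bagit.txt marks a BagIt bag with payload in data/.
        # Checksums are NOT verified in this sketch.
        if prefix + "bagit.txt" in files and prefix + "data/" + METADATA in files:
            return prefix + "data/" + METADATA
    return None


def read_metadata(zip_bytes):
    """Parse the metadata file from the bytes of a retrieved ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        path = find_metadata_path(archive.namelist())
        if path is None:
            raise ValueError("no ro-crate-metadata.json found in archive")
        return json.loads(archive.read(path))
```

Returning `None` when the root holds several folders keeps the ambiguous case explicit rather than guessing which folder is the crate.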
{
"@id": "./",
"@type": "Dataset",
"identifier": "https://doi.org/10.48546/workflowhub.workflow.775.1",
"url": "https://workflowhub.eu/workflows/775/ro_crate?version=1",
"name": "Research Object Crate for Jupyter Notebook Molecular Structure Checking",
"distribution": {"@id": "https://workflowhub.eu/workflows/775/ro_crate?version=1"},
"…": ""
},
{
"@id": "https://workflowhub.eu/workflows/775/ro_crate?version=1",
"@type": "DataDownload",
"encodingFormat": ["application/zip", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263"}],
"conformsTo": { "@id": "https://w3id.org/ro/crate" }
}
or it can have a subjectOf to a ro-crate-metadata.json:
{
"@id": "http://example.com/another-crate/",
"@type": "Dataset",
"conformsTo": { "@id": "https://w3id.org/ro/crate" },
"subjectOf": { "@id": "http://example.com/another-crate/ro-crate-metadata.json" }
},
{
"@id": "http://example.com/another-crate/ro-crate-metadata.json",
"@type": "CreativeWork",
"encodingFormat": "application/ld+json"
}
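As a rough illustration of how a consumer might use the two patterns above, the following Python sketch collects retrieval candidates for a referenced crate from the referencing crate's `@graph`. The function name `retrieval_candidates` and the returned tuple shape are hypothetical. Matching `conformsTo` by prefix (so that versioned identifiers would also match) is my assumption, not something the PR text specifies.

```python
RO_CRATE = "https://w3id.org/ro/crate"


def as_list(value):
    """Normalise a JSON-LD value that may be absent, single, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ids(value):
    """Collect the @id of every reference object in a JSON-LD value."""
    return [v["@id"] for v in as_list(value) if isinstance(v, dict) and "@id" in v]


def retrieval_candidates(graph, dataset_id):
    """Return (kind, url) pairs for retrieving the crate `dataset_id`.

    Hypothetical helper: 'distribution' marks a download declared to
    conform to RO-Crate; 'metadata' marks a JSON-LD file via subjectOf.
    """
    by_id = {e["@id"]: e for e in graph if "@id" in e}
    dataset = by_id.get(dataset_id, {})
    found = []
    for dist_id in ids(dataset.get("distribution")):
        dist = by_id.get(dist_id, {})
        # Assumption: prefix match on the RO-Crate identifier.
        if any(i.startswith(RO_CRATE) for i in ids(dist.get("conformsTo"))):
            found.append(("distribution", dist_id))
    for meta_id in ids(dataset.get("subjectOf")):
        meta = by_id.get(meta_id, {})
        if "application/ld+json" in as_list(meta.get("encodingFormat")):
            found.append(("metadata", meta_id))
    return found
```

A caller would then try each candidate in turn, for example downloading a `distribution` as a ZIP or fetching a `metadata` URL directly as JSON-LD.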
As used by the 5s-crate profile: https://trefx.uk/5s-crate/0.4/#referencing-a-workflow-crate
Could @dgarijo or @ptsefton have a look at this? I've used it here: https://stain.github.io/workflow-run-crate/profiles/0.5-DRAFT/process_run_crate/ro-crate-preview.html#https%3A//www.researchobject.org/workflow-run-crate-paper/mapping/
Perhaps we should also add the isPartOf pattern, to show how to mention a file within another crate? (It could get tricky to make absolute URIs.)
Will do when I finish the ISWC reviews that are due tomorrow :(((
Thanks @stain, I have had a look. The only thing that is not fully clear to me is where the distribution information is supposed to be added: is it in the crate referencing the other crate, or in the referenced crate's metadata?
For example, let's say crate A references crate B. Usually I would add a link in A to B. But here you recommend also adding where B is stored, correct? As opposed to adding a link to B and hoping that, when I resolve the id, I get JSON-LD with the distribution information.
The only potential issue I see is that distributions may not have persistent ids. If the link from A to B persists but the distribution has moved elsewhere in the meantime, B has no means to tell A about this. But I am OK with this limitation.
I think this is good to go in now; waiting for final approval from @dgarijo or @jmfernandez.
I can't seem to resolve the requested changes from @dgarijo although I have addressed them in earlier commits.
Sorry @stain I am not available right now for review. Will do as soon as I can.
After the meeting on 2024-08-22, revised to not require a subjectOf declaration if the RO-Crate metadata file can be resolved. This meant adding a section on how the absolute URI of the RO-Crate can be determined if there is no identifier.
I now feel this should be split out into a page separate from data-entities.md
Once again, apologies. Will review by the end of the week.
Thanks everyone, will merge now after fixing the latest round of typos! Then we'll see if we need to move it out.