Proposal: Make v2 spec smaller and tighter and concentrate on named rules (with a code) of syntax and semantics of the RO-Crate-Metadata document from fundamentals

Open ptsefton opened this issue 1 year ago • 5 comments

The current RO-Crate spec is a mixture of crate-making guide and semi-formal specification. I propose that we still provide an introduction that shows RO-Crate in use, but that the formal specification is devoted to rules and definitions and is presented in an order that reflects the steps needed to write RO-Crate software. That is, start with the fact that it needs to be JSON, that it needs a @context and a @graph, and that the @graph is an array of entities with certain characteristics, and build from there (see my proposed clarification on types of crate, which separates "abstract" RO-Crate documents from the packaging function, where there are rules about resolving data entities).

Proposal:

  1. The spec is presented as a set of Rules in the order that a software developer would need to follow to write RO-Crate software in a new language
  2. Each Rule has a Code, so that errors and warnings can be presented to users with a link back to the spec
  3. At least the two main current implementations (RO-Crate-py and RO-Crate-JS) will have new versions that have documentation that we can link to in the spec

(The idea that we might be able to encode some of the rules in a language like LinkML, JSON-Schema or SHACL is attractive, and this MAY be an option for defining the spec, but AFAIK it is not possible to do this without wrapping the schema language in procedural code, as seen in ro-crate-validator. This seems too cumbersome to be the basis for a spec definition, but we could definitely make sure each part of a validator is labelled with codes that relate back to the spec - see 2. above)
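
To make points 1 and 2 concrete, here is a rough Python sketch of what rule-by-rule checking with codes could look like. It is only an illustration: the codes (`R001` and so on), the `SPEC_URL` placeholder, the `Violation` class, and the `check_metadata` function are all made up for this example, and "an object with an @id" is an assumed stand-in for the "certain characteristics" of an entity. Nothing here is taken from RO-Crate-py, RO-Crate-JS or ro-crate-validator.

```python
import json
from dataclasses import dataclass

# Placeholder URL (assumption); a real spec would publish its own anchors.
SPEC_URL = "https://example.org/ro-crate-2-spec"


@dataclass(frozen=True)
class Violation:
    code: str      # hypothetical rule code, e.g. "R001"
    message: str   # human-readable explanation

    @property
    def link(self) -> str:
        # Lets a tool point the user back at the rule in the spec.
        return f"{SPEC_URL}#{self.code}"


def check_metadata(text: str) -> list[Violation]:
    """Apply the rules in the order a developer would implement them."""
    # Rule 1: the document must be JSON.
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        return [Violation("R001", f"not valid JSON: {err.msg}")]

    # Rule 2: the top level must be a JSON object.
    if not isinstance(doc, dict):
        return [Violation("R002", "top level must be a JSON object")]

    found = []

    # Rule 3: it needs a @context.
    if "@context" not in doc:
        found.append(Violation("R003", "missing @context"))

    # Rule 4: it needs a @graph, and the @graph is an array.
    graph = doc.get("@graph")
    if not isinstance(graph, list):
        found.append(Violation("R004", "@graph must be an array"))
        return found

    # Rule 5 (assumed characteristic): each entity is an object with an @id.
    for i, entity in enumerate(graph):
        if not isinstance(entity, dict) or "@id" not in entity:
            found.append(
                Violation("R005", f"@graph[{i}] must be an object with an @id")
            )
    return found
```

A caller could then print `violation.code`, `violation.message` and `violation.link` for each result, which is the "errors and warnings with a link back to the spec" idea from point 2.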

ptsefton avatar Jan 06 '25 23:01 ptsefton

I have started playing with these ideas on my GitHub. These are NOT firm proposals, but are intended to start a conversation and see what ideas other people have: https://github.com/ptsefton/ro-crate-2-experiments/blob/main/ro-crate-2-spec.md

@elichad -- as discussed in Slack

ptsefton avatar Jan 06 '25 23:01 ptsefton

Adding to this: we have already discussed splitting out the parts of the spec that can be seen as "patterns" for specific types of entity into a separate place (either a different section of the spec to the technical rules, or a different section of the website altogether). I think this would make both elements of the documentation much more usable.

This aligns quite nicely with the Diátaxis framework's distinction between "reference" and "how-to guides". I think that framework is good to consider as we rethink how we provide our documentation.

Thinking about that raises a question for me: knowing that v2 will probably take a long time to develop and release (unless we are very strict about what goes into it), could we consider reworking our spec/documentation structure with just the v1.2 material before we expand it for v2? i.e. no changes to the technical requirements or recommendations, just changing the way they are organized in writing.

elichad avatar Jan 07 '25 14:01 elichad

The Diátaxis stuff on "reference" looks OK to me, though a spec is a special kind of reference that they don't seem to cover. I'm less sure about their definitions of how-to guide vs tutorial, but we can discuss that.

My initial reaction to a rewrite of 1.2 is that it is likely to take a long time, and our effort is better spent on 2.0 at this point. Having started playing with it, I do think that if we keep it to a small core, with profiles and guides to cover a lot of what is in 1.x, we should be able to get it done in 2025.

ptsefton avatar Jan 07 '25 21:01 ptsefton

Agreed in steering committee 2025-09-05

stain avatar Sep 05 '25 08:09 stain

Late reaction: I like the spec-as-an-implementing-developer-guide idea. A clear target audience surely serves as a focal point, allowing us to cut text or move it out to separate docs.

Still, rereading after some time makes me see that there could be different angles to such an "implementation". Many distinct objectives exist for those: reading, writing, viewing content, validating, and so on. So we might want to either select one, prioritize among them, or just clearly mark sections as targeting one of these distinct implementation objectives?

mpo-vliz avatar Oct 15 '25 08:10 mpo-vliz