nf-prov
Review BCO missing fields
Summary of the BCO fields that are missing or incomplete:
- provenance_domain
  - review
  - derived_from
  - obsolete_after
  - embargo
  - contributors (affiliation, email, orcid)
  - license
- usability_domain
- description_domain
  - keywords
  - xref
  - pipeline_steps (version, prerequisite)
- execution_domain
  - external_data_endpoints
  - environment_variables
- error_domain
  - empirical_error
  - algorithmic_error
See the BCO User Guide for descriptions of these fields.
- Some fields, like `usability_domain`, are free text and seemingly meant to be completed manually.
- Some fields, like `review` and `obsolete_after`, might be automated by a larger system that can launch pipelines and has the requisite knowledge. nf-prov could act as a pass-through by accepting these fields as config settings.
- Some fields, like `license` and `keywords`, could probably just be added to the Nextflow `manifest` config scope.
- Some fields, like `version` and `prerequisite`, could probably be added but might not be worth the effort. For example, these tool-metadata fields are implicitly described by the pipeline repo and commit hash, so they aren't really needed as long as the git hash is provided.
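
To illustrate the pass-through idea, here is a rough Python sketch (nf-prov itself is not written in Python, and nothing here is an existing nf-prov API). The function name `apply_passthrough` and the shape of `overrides` are assumptions made for illustration; the point is only that user-supplied values fill fields that are missing or empty, without overwriting anything that was generated automatically.

```python
import copy


def apply_passthrough(bco, overrides):
    """Return a copy of `bco` with user-supplied `overrides` merged in.

    Hypothetical helper: nested dicts are merged recursively, and a value is
    written only where the existing field is missing or empty, so values
    that were generated automatically are never overwritten.
    """
    merged = copy.deepcopy(bco)
    for key, value in overrides.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            # Descend into nested domains such as provenance_domain.
            merged[key] = apply_passthrough(current, value)
        elif current in (None, "", [], {}):
            # Field is missing or present-but-empty: take the user's value.
            merged[key] = copy.deepcopy(value)
    return merged


# Made-up manifest fragment with some empty fields.
generated = {
    "provenance_domain": {"name": "my-pipeline", "license": "", "review": []},
    "usability_domain": [],
}

# Made-up pass-through values, e.g. read from config settings.
overrides = {
    "provenance_domain": {"name": "ignored", "license": "MIT"},
    "usability_domain": ["Example free-text description."],
}

completed = apply_passthrough(generated, overrides)
```

Because the values would come from config rather than manual edits to the output file, they would be recorded wherever the run's configuration is recorded.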
The BCO manifest produced by nf-prov should always be valid against the JSON schema even if it isn't complete; some of the missing fields are present but empty. At the end of the day, the user can add any missing details by hand, but it might be better to provide some pass-through config settings so that those manual edits are tracked, e.g. in the run history of their workflow platform.
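
For anyone who wants to see which fields are present but empty in a given manifest, a small Python sketch like the following might help. The function name `find_empty_fields` and the sample manifest are made up for illustration; this only reports empty values and is not a substitute for validating against the JSON schema.

```python
def find_empty_fields(node, path=""):
    """Return dotted paths of fields whose value is None, "", [] or {}."""
    empty = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if value in (None, "", [], {}):
                empty.append(child)
            else:
                # Non-empty value: look for empty fields nested inside it.
                empty.extend(find_empty_fields(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            empty.extend(find_empty_fields(item, f"{path}[{index}]"))
    return empty


# Made-up manifest fragment; a real one would be loaded with json.load().
manifest = {
    "provenance_domain": {
        "name": "my-pipeline",
        "license": "",
        "contributors": [{"name": "A. Author", "orcid": ""}],
    },
    "error_domain": {"empirical_error": {}, "algorithmic_error": {}},
}

empty_fields = find_empty_fields(manifest)
for field in empty_fields:
    print(field)
```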
Anyway, just wanted to put this analysis here as a reference for anyone who wants to improve the BCO manifest.