Have the QC tool report on metadata alignment with DataCite

Open egonw opened this issue 9 months ago • 1 comments

Right now, Derby files often report something like this:

DATASOURCENAME	Ensembl
BUILDDATE	20230311
SERIES	Homo sapiens genes and proteins
DATATYPE	GeneProduct
DATASOURCEVERSION	108
SCHEMAVERSION	3

This needs to be extended to comply with the DataCite standard.

These should be added:

  • DATASOURCEID: should list a DOI (https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/identifier/, mandatory, not URL)
  • CREATOR: the name of the creator. The creator here is the creator of the Derby file (https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/creator/, mandatory)
  • CREATORORCID: ORCID of the creator (https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/creator/, mandatory)
  • PUBLISHER: likely Figshare or Zenodo
  • LICENSE: the license of this data, e.g. "CC-BY 4.0 International" (https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/rights/)

Other items we should add:

  • WEBSITE: pointing to a webpage with more information about this Derby file
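
To make the request concrete, here is a minimal sketch of what such a QC report could look like. It is written in Python for brevity and is not the actual QC tool. The field names come from the lists above; the function names (`parse_info`, `report`), the split into errors (fields marked mandatory above) and warnings (the rest), and the DOI and ORCID patterns are all assumptions made for illustration.

```python
import re

# Fields marked "mandatory" in this issue are reported as errors when missing;
# the remaining proposed fields only produce warnings (an assumption).
REQUIRED = ("DATASOURCEID", "CREATOR", "CREATORORCID")
RECOMMENDED = ("PUBLISHER", "LICENSE", "WEBSITE")

# Simplified patterns (assumptions): a bare DOI such as 10.xxxx/suffix, and a
# bare ORCID iD in the 0000-0000-0000-0000 form (last character may be X).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def parse_info(text):
    """Parse tab-separated KEY<TAB>VALUE lines into a dict."""
    info = {}
    for line in text.splitlines():
        if "\t" in line:
            key, value = line.split("\t", 1)
            info[key.strip()] = value.strip()
    return info


def report(info):
    """Return a list of messages about missing or malformed fields."""
    messages = []
    for key in REQUIRED:
        if not info.get(key):
            messages.append(f"ERROR: missing {key}")
    for key in RECOMMENDED:
        if not info.get(key):
            messages.append(f"WARNING: missing {key}")
    doi = info.get("DATASOURCEID")
    if doi and not DOI_PATTERN.match(doi):
        messages.append("ERROR: DATASOURCEID is not a bare DOI")
    orcid = info.get("CREATORORCID")
    if orcid and not ORCID_PATTERN.match(orcid):
        messages.append("ERROR: CREATORORCID is not a bare ORCID iD")
    return messages


# The metadata currently reported by a Derby file, as quoted in this issue.
current = (
    "DATASOURCENAME\tEnsembl\n"
    "BUILDDATE\t20230311\n"
    "SERIES\tHomo sapiens genes and proteins\n"
    "DATATYPE\tGeneProduct\n"
    "DATASOURCEVERSION\t108\n"
    "SCHEMAVERSION\t3\n"
)

for message in report(parse_info(current)):
    print(message)
```

Run against the current metadata, this flags all six proposed fields as missing. A DATASOURCEID given as a URL rather than a bare DOI would also be reported, in line with the "not URL" note above.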

egonw, May 16 '24 08:05