oemetadata

Metadata Standard needs to be more modular

Open Ludee opened this issue 1 year ago • 6 comments

Description of the issue

There are additional metadata requirements from energy-related domains:

  • MonitoringDB: Measurement data, Methods
  • Transport: Methods
  • Provenance and History: PROV-O
  • Processing: ERSmeta

Ideas of solution

Add additional optional sections.

Example Data for testing

  • Processing: open_MaStR
  • Measurement: PV-timeseries
  • ...

Workflow checklist

Ludee, Nov 19 '24 22:11

Possible modules are:

  • Measurement Data
  • Transport methods
  • Advanced Provenance (PROV-O?)
  • Processing
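
One way to read "optional sections" is as optional top-level keys that sit next to the core metadata and are only handled when present. The sketch below illustrates that idea only; the key names (`measurement`, `transport`, `provenance`, `processing`) and the helper `split_modules` are assumptions for illustration and are not part of the oemetadata specification.

```python
from __future__ import annotations

# Hypothetical module keys, derived from the list above (not defined by oemetadata).
OPTIONAL_MODULES = ("measurement", "transport", "provenance", "processing")


def split_modules(metadata: dict) -> tuple[dict, dict]:
    """Separate core metadata from whichever optional modules are present."""
    core = {k: v for k, v in metadata.items() if k not in OPTIONAL_MODULES}
    modules = {k: v for k, v in metadata.items() if k in OPTIONAL_MODULES}
    return core, modules


# Placeholder metadata: a core field plus one optional module.
metadata = {
    "name": "pv_timeseries",
    "measurement": {"quantity": "power"},
}
core, modules = split_modules(metadata)
```

With this layout, metadata without any module remains valid core metadata, and each module could be validated by its own schema.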

Ludee, Jan 09 '25 10:01

This is a great input for measurement data and devices: https://zenodo.org/records/6396467/files/Metadata_Schema_Persistent_ID_Instruments_Zenodo_Final.pdf

B2Inst: https://b2inst.gwdg.de/

Ludee, May 13 '25 11:05

Add a module for data processing to describe software, tools, scripts, pipelines etc.

Ludee, Jul 24 '25 07:07

B2Inst defines the following mandatory fields:

  • Identifier (instrument instance)
  • Path (Landing Page / URL)
  • Name (instrument)
  • OwnerName (institution)
  • ManufacturerName (instrument)
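
As a minimal sketch, a check for these five mandatory fields could look as follows. The flat dictionary layout and the helper `missing_mandatory_fields` are assumptions for illustration; the field names are taken from the list above, and the actual B2Inst schema may name or nest them differently.

```python
from __future__ import annotations

# Mandatory fields as listed in the comment above.
B2INST_MANDATORY_FIELDS = (
    "Identifier",
    "Path",
    "Name",
    "OwnerName",
    "ManufacturerName",
)


def missing_mandatory_fields(instrument: dict) -> list[str]:
    """Return the mandatory fields that are absent or empty (hypothetical helper)."""
    return [field for field in B2INST_MANDATORY_FIELDS if not instrument.get(field)]


# Placeholder instrument record; all values are made up.
instrument = {
    "Identifier": "example-instrument-id",
    "Path": "https://example.org/instruments/1",
    "Name": "Example PV inverter logger",
    "OwnerName": "Example Institute",
}
missing = missing_mandatory_fields(instrument)
```

Here `missing` contains `ManufacturerName`, the only mandatory field the placeholder record leaves out.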

Ludee, Jul 24 '25 10:07

Today we discussed this in our NFDI4Energy Meeting. We came up with the following points:

  • The B2Inst Schema can be found here: https://docs.eudat.eu/b2inst/forusers/#pid-inst-schema
  • Generally, it makes sense to have measurement and device submodules
    • The actual device is only optional information for a "measured data set". The main characteristics of that measurement are also important; if available, information on the measurement device / instrument can be added.
  • The type of the measurement device would be helpful as well, but that requires a good ontology covering the different instrument types that exist
    • B2Inst solved that with a recommended free-text field for the instrument type name, while an actual identifier for an instrument type (e.g. from an ontology) is only optional. -> Many different domain ontologies contain different device types.
  • Additional fields of interest
    • Age of the device / maintenance
    • What is the precision margin of the device? -> Implications for the uncertainty of the data
    • Which units can a device actually deliver? -> This allows validating the unit of the measured data and flagging a mismatch. However, data provenance might contain conversion steps (e.g. metric to imperial)
    • Measurement device metadata could be extended a lot, but much of it might not be relevant in the context of the actual data (e.g. frequency of failures, quality of maintenance, backup power supply, etc.) -> We might need multiple metadata modules to handle this for different use cases
  • Reuse of measurement device / equipment metadata (the lab infrastructure usually does not change quickly). Provide functionality to reuse it accordingly
    • That would need to be a feature of the OEMeta Builder
    • Data Loading for the OEMeta Builder would be a great feature in general -> can the OEMeta Builder become a piece of standalone software?
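
The unit check discussed above could be sketched roughly like this. The device key `supportedUnits` and the helper `unit_mismatches` are hypothetical; the `name` and `unit` keys follow the way oemetadata describes resource fields, but treat the exact layout as an assumption. Because provenance may include conversion steps, a mismatch is reported as a hint rather than raised as an error.

```python
from __future__ import annotations


def unit_mismatches(device: dict, fields: list[dict]) -> list[str]:
    """Return names of fields whose unit is not among the device's supported units."""
    supported = set(device.get("supportedUnits", []))
    if not supported:
        # No unit information on the device, so there is nothing to compare against.
        return []
    return [
        field["name"]
        for field in fields
        if field.get("unit") and field["unit"] not in supported
    ]


# Placeholder device and field descriptions; all values are made up.
device = {"Name": "Example power meter", "supportedUnits": ["W", "kW"]}
fields = [
    {"name": "timestamp", "unit": None},
    {"name": "power", "unit": "kW"},
    {"name": "temperature", "unit": "degF"},
]
warnings = unit_mismatches(device, fields)
```

In this example only `temperature` is flagged; fields without a unit are skipped.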

Cpprentice, Aug 14 '25 12:08

Data Loading for the OEMeta Builder would be a great feature in general -> can the OEMeta Builder become a piece of standalone software?

Actually, this is something that will become available within the next two months. I have already started building it, as I need it for another project. It will be part of the omi tool, which currently lacks 95% of its documentation. omi is used to get started with creating oemetadata (soon also from YAML files): it can infer metadata from data resources, validate metadata locally and against the OEP, and convert metadata to recent versions. This tool also handles the oemetadata validation in the OEP.

There will then be an OEMetaBuilder module to create resource descriptions and package them into a data package / dataset which describes multiple resources.

jh-RLI, Aug 14 '25 13:08