spdx-examples
Add a Dataset Profile example (CO2 dataset)
Inspired by https://github.com/owid/co2-data/
Structurally and semantically validated against all tools here: https://github.com/spdx/spdx-3-model/blob/main/serialization/json_ld/validation.md
TODOs:
- [x] Valid JSON-LD
- [x] Parseable/makes sense on https://json-ld.org/playground/
- [x] Pass validation/generation of spdx3ToGraph
- [x] Pass pyshacl
- [x] Pass check-jsonschema
- [x] Pass ajv
Experiment notes:
- Successfully validated with all the tools (`ajv`, `check-jsonschema`, `pyshacl`) listed at https://github.com/spdx/spdx-3-model/blob/main/serialization/json_ld/validation.md
- Warning messages from `ajv` and `check-jsonschema` are sometimes not very helpful
  - If there are warnings/errors, try removing some objects from your JSON. Once the smaller JSON validates, gradually add a few more back.
- `spdx3ToGraph` can be handier for detecting errors in the first runs, as it can provide more useful error messages (use https://github.com/maxhbr/spdx3ToGraph/pull/2 to get a more exact location of the error)
  - But `spdx3ToGraph` validation will not check cardinality; you still have to use `ajv` or `check-jsonschema` for that. (The tool is meant primarily for visualization, btw)
    - If maxCount is `*`, the data type must be an array
    - If maxCount is
- Many of the errors found in this attempt (and in a few other examples) are about serialized names. So if the TODO in https://github.com/spdx/spdx-spec/issues/975 is completed, it will help a lot.
- A PlantUML diagram generated from `spdx3ToGraph` can be useful for understanding the overall structure
  - However, due to a limitation of the PlantUML visualizers (I use the ones from PlantUML.com, online and offline), if you have a very long spdxId (based on UUIDv4, for example), your diagram is very likely to overflow or get cropped.
  - For this example, I edited the generated PlantUML file to have shorter spdxIds before submitting it to the visualizer again, just to have a diagram that actually fits. (The IDs in the JSON-LD file are untouched)
- Real-time validation in an editor would help. VS Code supports JSON validation with a schema. If you are familiar with VS Code, please help review this PR, which adds VS Code validation to the validation document: https://github.com/spdx/spdx-3-model/pull/790
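The "validate a smaller JSON first, then gradually add more" tip from the notes above could be automated roughly like this. This is only a sketch: `find_failing_elements` and `toy_is_valid` are hypothetical names, `is_valid` stands in for whatever wraps your real `ajv` or `check-jsonschema` run, and the toy rule only mimics the "maxCount is `*` means array" check, using `verifiedUsing` purely for illustration.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def find_failing_elements(doc: Doc, is_valid: Callable[[Doc], bool]) -> List[int]:
    """Add @graph elements one at a time and report the indexes that break validation.

    `is_valid` is a stand-in for a real validator call; it receives a
    copy of the document whose @graph holds only the elements accepted so
    far plus the one being tried.
    """
    kept: List[Any] = []
    failing: List[int] = []
    for index, element in enumerate(doc.get("@graph", [])):
        candidate = {**doc, "@graph": kept + [element]}
        if is_valid(candidate):
            kept.append(element)   # still valid, keep it for the next round
        else:
            failing.append(index)  # this element is what broke validation
    return failing


def toy_is_valid(doc: Doc) -> bool:
    """Toy rule for the demo only: `verifiedUsing`, if present, must be an array."""
    return all(
        isinstance(element.get("verifiedUsing", []), list)
        for element in doc.get("@graph", [])
    )


# Hypothetical three-element graph; the second element uses an object
# where the toy rule expects an array.
example = {
    "@context": "https://spdx.org/rdf/3.0.0/spdx-context.jsonld",
    "@graph": [
        {"type": "software_Package", "verifiedUsing": []},
        {"type": "dataset_DatasetPackage", "verifiedUsing": {"algorithm": "sha256"}},
        {"type": "Relationship"},
    ],
}

print(find_failing_elements(example, toy_is_valid))  # [1]
```

This only helps with per-element problems such as cardinality or misspelled serialized names; it does not say anything about SHACL-level checks, which would still need `pyshacl` on the full file.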
This is how I put AnyLicenseInfo in this example, to please the SHACL validator, as a workaround for the lack of ListedLicense at this moment. (This will be removed once https://github.com/spdx/LicenseListPublisher/issues/183 is implemented).
LicenseExpression is a subclass of AnyLicenseInfo and is valid to use as the "to" in "has license" relationships.
The spdxId is set to be identical to the expected license IRI. This means that when the license (CC-BY-4.0) becomes available as a ListedLicense (and uses this IRI), this LicenseExpression workaround element can be removed without any need to change the "has license" relationships.
{
  "type": "simplelicensing_LicenseExpression",
  "spdxId": "https://spdx.org/licenses/CC-BY-4.0",
  "creationInfo": "_:creationinfo",
  "simplelicensing_licenseExpression": "CC-BY-4.0",
  "simplelicensing_licenseListVersion": "3.24.0"
}
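To illustrate why reusing the license IRI as the spdxId keeps the relationships stable, here is a small sketch. The relationship element, its IDs, and the `hasDeclaredLicense` relationship type are assumptions made up for this illustration (they are not copied from the example file), and `drop_workaround` is a hypothetical helper.

```python
from typing import Any, Dict, List

CC_BY_IRI = "https://spdx.org/licenses/CC-BY-4.0"

# The workaround element from this PR.
workaround = {
    "type": "simplelicensing_LicenseExpression",
    "spdxId": CC_BY_IRI,
    "creationInfo": "_:creationinfo",
    "simplelicensing_licenseExpression": "CC-BY-4.0",
    "simplelicensing_licenseListVersion": "3.24.0",
}

# Hypothetical "has license" relationship; the spdxId and "from" values
# are placeholders invented for this sketch.
relationship = {
    "type": "Relationship",
    "spdxId": "https://example.org/spdx/relationship-1",
    "creationInfo": "_:creationinfo",
    "from": "https://example.org/spdx/dataset-1",
    "relationshipType": "hasDeclaredLicense",
    "to": [CC_BY_IRI],
}


def drop_workaround(graph: List[Dict[str, Any]], iri: str) -> List[Dict[str, Any]]:
    """Remove the LicenseExpression stand-in for `iri`, leaving everything else as-is."""
    return [
        element
        for element in graph
        if not (
            element.get("type") == "simplelicensing_LicenseExpression"
            and element.get("spdxId") == iri
        )
    ]


graph_now = [workaround, relationship]
graph_later = drop_workaround(graph_now, CC_BY_IRI)

# The relationship is untouched and still points at the same IRI, which
# a published ListedLicense would then be expected to resolve.
print(graph_later == [relationship])
```

The point is that only the workaround element goes away; nothing in the relationship needs editing, assuming the ListedLicense is published under the same IRI.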
--
For this example, we can decide which version of BOM we would like to have:
1. a BOM that is valid as of actual ontology (current ontology without `ListedLicense`)
2. a BOM that is valid as of ontology as designed (future ontology with `ListedLicense`)
There can be 3 decision options:
a. If (1) is ok, we can merge as it is. And once https://github.com/spdx/LicenseListPublisher/issues/183 is implemented, we can revise the BOM again to remove the workaround LicenseExpression.
b. If (2) is preferred, I can remove the workaround LicenseExpression element now, so it can get merged (after other necessary revisions).
c. Last option is doing nothing until we have all the required ListedLicense and then go with (2).
@rgopikrishnan91 this PR please Gopi
@rgopikrishnan91 @bennetkl please kindly review. Thank you.
@kestewart I believe you will have to merge, I don't have permission for this one.
Discussed in AI call. Merging.