spdx-spec
ontology/ directory is no longer updated - suggest deletion
Content in https://github.com/spdx/spdx-spec/tree/development/v3.0.1/ontology is outdated.
One of the files (ontology/model.plantuml) was generated by spec-parser on 2024-02-22, before the release of 3.0 (2024-04-15).
It contains outdated concepts like Software/isDirectory and Dataset/sensitivePersonalInformation, and it lacks new concepts like Dataset/DatasetPackage and AI/energyUnit.
The ontology published on spdx.org is up to date. All these files are good:
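A check like the one described above could be scripted. The following is only a minimal sketch: the helper name `find_stale_terms` is hypothetical, and it assumes the generated file mentions each concept by its local name (for example `isDirectory`), which may not hold for every output format.

```python
import re


def find_stale_terms(text, removed, added):
    """Return (removed terms still present, added terms still missing).

    Terms are given as "Profile/name"; only the local name after the
    slash is searched for, as a whole word. This is an assumption about
    how the generated file spells its concepts.
    """
    def mentioned(term):
        local_name = term.rsplit("/", 1)[-1]
        return re.search(r"\b" + re.escape(local_name) + r"\b", text) is not None

    still_present = [t for t in removed if mentioned(t)]
    still_missing = [t for t in added if not mentioned(t)]
    return still_present, still_missing
```

To use it, read the file contents (for example ontology/model.plantuml) into a string and pass the concept names from this issue as the `removed` and `added` lists.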
- https://spdx.org/rdf/3.0.0/spdx-model.ttl
- https://spdx.org/rdf/3.0.0/spdx-context.jsonld
- https://spdx.org/schema/3.0.0/spdx-json-schema.json
Should we update this ontology/ directory or should we just delete it to avoid confusion?
Every run of the spec-parser generates an updated ontology in all formats, including the PlantUML diagram.
The generated files should be copied into the ontology directory.
How are we going to communicate about these "alternatives"? (Like, when to use which source)
- https://spdx.org/rdf/3.0.0/spdx-model.ttl (official)
- https://spdx.github.io/spdx-3-model/model.ttl
- Same file as (?) https://github.com/spdx/spdx-3-model/blob/gh-pages/model.ttl (the gh-pages branch is suggested in the spdx/spdx-3-model README, and this exact URL is used in spdx-3-model/serialization/json-ld.md)
- https://github.com/spdx/spdx-spec/blob/development/v3.0.1/ontology/ontology.rdf.ttl
I realized that there should be at least two sources: one is stable (generated at release for public use), another is for development (generated frequently for testing/development).
The stable one should be at spdx.org (1), and all specs/docs should point to those new IRIs.
The question is probably where the source for development should live.
For example, in spdx-3-model repo (2) or in spdx-spec repo (3).
Once we pick one, we should delete the other to avoid confusion. After that, the README etc. should also be updated to make it clear that this source is unstable.
The current situation is not very friendly for newcomers.
The same concern about having multiple sources of truth (and not keeping them updated) that arises here for the spdx-spec repo applies to the spdx-3-model repo as well. See: https://github.com/spdx/spdx-3-model/issues/726
From the tech call - we agreed the ontology directory can be deleted.
The gh-pages version is being generated from the model repo.
@zvr - let me know if you agree that it can be deleted
@goneall I admit that I like having a directory where one can see everything as files, as we had until now at https://github.com/spdx/spdx-spec/tree/development/v2.3/ontology
Having the individual files available elsewhere is useful but not equivalent.
But maybe I misunderstood what you wrote. Was the question to me about the ontology directory here? Or the gh-pages stuff in spdx-3-model? Reading the minutes it seems only deletion of spdx-3-model/gh-pages is mentioned.
Anyway, my views:
- delete the spdx-3-model gh-pages stuff
- keep the ontology directory here and populate it with the RDF ontology in different forms, etc.
- also have the diagrams (both automatically generated and hand-curated ones) somewhere in the spdx-spec repo.
(and remember, we still have to generate RDF documentation and publish that one)
If we want to keep this directory, we should have a CI job that copies any generated files or original files from the spdx-3-model repo. Doing it manually leads to omissions and inconsistencies.
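As a rough illustration of what such a CI step might run, here is a minimal sketch. The function name `sync_generated`, the directory arguments, and the file patterns are all assumptions for illustration; the actual spec-parser output layout may differ.

```python
import shutil
from pathlib import Path

# Hypothetical patterns; adjust to whatever spec-parser actually emits.
DEFAULT_PATTERNS = ("*.ttl", "*.jsonld", "*.json", "*.plantuml")


def sync_generated(src_dir, dest_dir, patterns=DEFAULT_PATTERNS):
    """Copy generated ontology files from src_dir into dest_dir.

    Returns the names of the files copied, in the order they were copied.
    Files in dest_dir that no longer exist in src_dir are left alone.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, dest / path.name)
                copied.append(path.name)
    return copied
```

A CI job could call this after each spec-parser run and commit the result, so the directory never drifts from the generated output. Removing files that are no longer generated is deliberately left out of this sketch.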
The ontology/ directory is deleted by PR #963.
We can close this issue, or we can leave it open as a reminder to add the directory back once the CI is ready, per this comment: https://github.com/spdx/spdx-spec/pull/963#issuecomment-2125493547
I've opened #979 for the task of keeping the generated ontology.