
How to extract the NamespaceMap for SpdxDocument from RDF

Open maxhbr opened this issue 2 years ago • 2 comments

Since https://github.com/spdx/spdx-3-model/pull/491 was merged, tools are responsible for extracting the namespace map from the native namespace declarations present in RDF / JSON-LD. I tried that and have not yet found the right approach.

The Issue

Many unrelated namespaces are present even in an empty document. Most come from RDF overhead, and some come from the SPDX spec:

DEBUG:   namespace: brick -> https://brickschema.org/schema/Brick#
DEBUG:   namespace: csvw -> http://www.w3.org/ns/csvw#
DEBUG:   namespace: dc -> http://purl.org/dc/elements/1.1/
DEBUG:   namespace: dcat -> http://www.w3.org/ns/dcat#
DEBUG:   namespace: dcmitype -> http://purl.org/dc/dcmitype/
DEBUG:   namespace: dcterms -> http://purl.org/dc/terms/
DEBUG:   namespace: dcam -> http://purl.org/dc/dcam/
DEBUG:   namespace: doap -> http://usefulinc.com/ns/doap#
DEBUG:   namespace: foaf -> http://xmlns.com/foaf/0.1/
DEBUG:   namespace: geo -> http://www.opengis.net/ont/geosparql#
DEBUG:   namespace: odrl -> http://www.w3.org/ns/odrl/2/
DEBUG:   namespace: org -> http://www.w3.org/ns/org#
DEBUG:   namespace: prof -> http://www.w3.org/ns/dx/prof/
DEBUG:   namespace: prov -> http://www.w3.org/ns/prov#
DEBUG:   namespace: qb -> http://purl.org/linked-data/cube#
DEBUG:   namespace: schema -> https://schema.org/
DEBUG:   namespace: sh -> http://www.w3.org/ns/shacl#
DEBUG:   namespace: skos -> http://www.w3.org/2004/02/skos/core#
DEBUG:   namespace: sosa -> http://www.w3.org/ns/sosa/
DEBUG:   namespace: ssn -> http://www.w3.org/ns/ssn/
DEBUG:   namespace: time -> http://www.w3.org/2006/time#
DEBUG:   namespace: vann -> http://purl.org/vocab/vann/
DEBUG:   namespace: void -> http://rdfs.org/ns/void#
DEBUG:   namespace: wgs -> https://www.w3.org/2003/01/geo/wgs84_pos#
DEBUG:   namespace: owl -> http://www.w3.org/2002/07/owl#
DEBUG:   namespace: rdf -> http://www.w3.org/1999/02/22-rdf-syntax-ns#
DEBUG:   namespace: rdfs -> http://www.w3.org/2000/01/rdf-schema#
DEBUG:   namespace: xsd -> http://www.w3.org/2001/XMLSchema#
DEBUG:   namespace: xml -> http://www.w3.org/XML/1998/namespace
DEBUG:   namespace: ai -> https://spdx.org/rdf/v3/AI/
DEBUG:   namespace: build -> https://spdx.org/rdf/v3/Build/
DEBUG:   namespace: core -> https://spdx.org/rdf/v3/Core/
DEBUG:   namespace: dataset -> https://spdx.org/rdf/v3/Dataset/
DEBUG:   namespace: expandedlicensing -> https://spdx.org/rdf/v3/ExpandedLicensing/
DEBUG:   namespace: licensing -> https://spdx.org/rdf/v3/Licensing/
DEBUG:   namespace: ns0 -> http://www.w3.org/2003/06/sw-vocab-status/ns#
DEBUG:   namespace: security -> https://spdx.org/rdf/v3/Security/
DEBUG:   namespace: simplelicensing -> https://spdx.org/rdf/v3/SimpleLicensing/
DEBUG:   namespace: software -> https://spdx.org/rdf/v3/Software/

This makes it hard to identify the manually introduced namespaces.

Question: how would one extract only the part of that mapping that was intentionally declared by the creator of the document?

I see no easy answer here.

maxhbr, Nov 21 '23 16:11

One approach would be to filter out all namespaces that are in the standard SPDX context file. This would leave you with only the namespaces added beyond the SPDX spec.
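A minimal sketch of this filtering approach, assuming the bindings are already available as a plain prefix-to-URI mapping (for example, collected from whatever RDF library produced the DEBUG listing in the issue). The function name `extract_custom_namespaces`, the `KNOWN_DEFAULTS` set, and the `acme` namespace are hypothetical; the set is only a subset of the listing and would have to be completed from the actual context file and library defaults.

```python
# Every SPDX 3 model namespace in the listing starts with this base URI.
SPDX_MODEL_BASE = "https://spdx.org/rdf/v3/"

# Assumption: namespaces bound by the RDF tooling itself. This is only a
# subset of the listing from the issue, not a complete or official list.
KNOWN_DEFAULTS = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://www.w3.org/2000/01/rdf-schema#",
    "http://www.w3.org/2001/XMLSchema#",
    "http://www.w3.org/2002/07/owl#",
    "http://www.w3.org/XML/1998/namespace",
    "http://xmlns.com/foaf/0.1/",
    "http://www.w3.org/ns/shacl#",
}


def extract_custom_namespaces(bindings, known=KNOWN_DEFAULTS,
                              model_base=SPDX_MODEL_BASE):
    """Return the bindings that are neither known defaults nor SPDX model namespaces."""
    custom = {}
    for prefix, uri in bindings.items():
        uri = str(uri)
        if uri in known or uri.startswith(model_base):
            continue  # tooling overhead or part of the SPDX spec
        custom[prefix] = uri
    return custom


# Illustrative input: three bindings from the listing plus one made-up one.
bindings = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "core": "https://spdx.org/rdf/v3/Core/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "acme": "https://example.com/spdx/acme/",  # hypothetical creator namespace
}
print(extract_custom_namespaces(bindings))
```

One limitation of this sketch: a creator who deliberately declares one of the default namespaces (say, `foaf`) would lose that entry, because the filter cannot tell an intentional binding from an automatic one.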

Another approach would be to filter out all namespaces that are known to be part of property and type specifications.
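One possible reading of this second approach, sketched under assumptions: treat every predicate IRI and every object of `rdf:type` as a property or type specification, and drop any binding whose namespace is a prefix of such an IRI. Triples are modeled here as plain string tuples rather than objects from a specific RDF library, and the helper names, the sample triples, and the `acme` namespace are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def vocabulary_terms(triples):
    """Collect IRIs used as properties (predicates) or as types (rdf:type objects)."""
    terms = set()
    for _subject, predicate, obj in triples:
        terms.add(predicate)
        if predicate == RDF_TYPE:
            terms.add(obj)
    return terms


def drop_vocabulary_namespaces(bindings, triples):
    """Keep only bindings whose namespace is not used for any property or type."""
    terms = vocabulary_terms(triples)
    return {
        prefix: ns
        for prefix, ns in bindings.items()
        if not any(term.startswith(ns) for term in terms)
    }


# Illustrative data: one element typed and named with SPDX model terms.
bindings = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "core": "https://spdx.org/rdf/v3/Core/",
    "software": "https://spdx.org/rdf/v3/Software/",
    "acme": "https://example.com/spdx/acme/",  # hypothetical creator namespace
}
triples = [
    ("https://example.com/spdx/acme/pkg1", RDF_TYPE,
     "https://spdx.org/rdf/v3/Software/Package"),
    ("https://example.com/spdx/acme/pkg1",
     "https://spdx.org/rdf/v3/Core/name", "pkg1"),
]
print(drop_vocabulary_namespaces(bindings, triples))
```

This variant only removes namespaces that are actually used as vocabulary in the graph. Default bindings that are never used (most of the DEBUG listing in an empty document) would survive, so it would likely need to be combined with a known-namespace filter.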

I also don't think it would be an issue to include these additional namespaces in the namespace map - even though they may be redundant with the context file.

goneall, Nov 30 '23 19:11

Moving this to 3.1. If there is a need to document this, we can add it in that release.

goneall, Apr 03 '24 20:04