d3fend-ontology
Package URL
d3f:PackageURL
I'm not yet sure how best to model this, but I think it's important to be able to represent package URLs and to query them by package type (Maven, NPM, etc.) and by their data components (namespace, name, version, qualifiers, subpath).
Before I open a PR, I'd like feedback on the approach I'm taking and on whether this would be useful to others. I plan to use these properties to annotate software composition analysis with D3FEND, and I'd like to use PURLs to uniquely identify software packages across databases.
Definition
@prefix d3f: <http://d3fend.mitre.org/ontologies/d3fend.owl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
d3f:PackageURL a owl:Class ;
rdfs:label "Package URL" ;
rdfs:subClassOf d3f:Identifier,
[ a owl:Restriction ;
owl:onProperty d3f:identifies ;
owl:someValuesFrom d3f:SoftwarePackage
],
[ a owl:Restriction ;
owl:onProperty d3f:package-type ;
owl:someValuesFrom d3f:PackageURLType
],
[ a owl:Restriction ;
owl:onProperty d3f:package-namespace ;
owl:someValuesFrom xsd:string
],
[ a owl:Restriction ;
owl:onProperty d3f:package-name ;
owl:someValuesFrom xsd:string
],
[ a owl:Restriction ;
owl:onProperty d3f:package-version ;
owl:someValuesFrom xsd:string
],
[ a owl:Restriction ;
owl:onProperty d3f:package-qualifiers ;
owl:someValuesFrom xsd:string
],
[ a owl:Restriction ;
owl:onProperty d3f:package-subpath ;
owl:someValuesFrom xsd:string
] ;
d3f:definition """purl stands for package URL.
A purl is a URL composed of seven components:
scheme:type/namespace/name@version?qualifiers#subpath
Components are separated by a specific character for unambiguous parsing.
The definition for each component is:
* scheme: this is the URL scheme with the constant value of \"pkg\". This is not modeled in the RDF representation as it is tautological.
* type: the package \"type\" or package \"protocol\" such as maven, npm, nuget, gem, pypi, etc. Required.
* namespace: some name prefix such as a Maven groupid, a Docker image owner, a GitHub user or organization. Optional and type-specific.
* name: the name of the package. Required.
* version: the version of the package. Optional.
* qualifiers: extra qualifying data for a package such as an OS, architecture, a distro, etc. Optional and type-specific.
* subpath: extra subpath within a package, relative to the package root. Optional.
Components are designed such that they form a hierarchy from the most significant component on the left to the least significant component on the right.""" ;
rdfs:isDefinedBy <https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst> ;
rdfs:seeAlso <https://github.com/package-url/purl-spec> .
d3f:PackageURLType a owl:Class ;
rdfs:label "Package URL Type" ;
rdfs:seeAlso <https://github.com/package-url/purl-spec> ;
d3f:definition """Each package manager, platform, type, or ecosystem has its own conventions and protocols to identify, locate, and provision software packages.
The package type is the component of a package URL that is used to capture this information with a short string such as maven, npm, nuget, gem, pypi, etc.""" .
# Example individual for Maven. The PURL string value for the type is stored using the rdf:value property.
d3f:PackageURLType-Maven a owl:NamedIndividual, d3f:PackageURLType ;
rdfs:label "Maven" ;
rdf:value "maven" ;
rdfs:isDefinedBy <https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#maven> ;
rdfs:comment "maven for Maven JARs and related artifacts" ;
skos:example "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1",
"pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?type=pom",
"pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources",
"pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?type=zip&classifier=dist",
"pkg:maven/net.sf.jacob-projec/jacob@1.14.3?classifier=x86&type=dll",
"pkg:maven/net.sf.jacob-projec/jacob@1.14.3?classifier=x64&type=dll" ;
rdfs:seeAlso <https://repo.maven.apache.org/maven2> ;
d3f:definition """The default repository is https://repo.maven.apache.org/maven2.
The group id is the ``namespace`` and the artifact id is the ``name``.
Known qualifiers keys are: ``classifier`` and ``type`` as defined in the POM documentation. Note that Maven uses a concept / coordinate called packaging which does not map directly 1:1 to a file extension. In this use case, we need to construct a link to one of many possible artifacts. Maven itself uses type in a dependency declaration when needed to disambiguate between them.""" .
d3f:package-property a owl:ObjectProperty ;
rdfs:subPropertyOf d3f:associated-with ;
rdfs:domain d3f:SoftwarePackage ;
rdfs:label "package-property" ;
d3f:definition "x package-property y: The package x has the object property y." .
d3f:package-data-property a owl:DatatypeProperty ;
rdfs:subPropertyOf d3f:d3fend-artifact-data-property ;
rdfs:domain d3f:SoftwarePackage ;
rdfs:label "package-data-property" ;
d3f:definition "x package-data-property y: The package x has the data property y." .
d3f:package-type a owl:ObjectProperty, owl:FunctionalProperty ;
rdfs:subPropertyOf d3f:package-property ;
rdfs:range d3f:PackageURLType ;
rdfs:label "package-type" ;
skos:example "maven", "npm", "nuget", "gem", "pypi" ;
d3f:definition "x package-type y: The package x has the type y." .
d3f:package-namespace a owl:DatatypeProperty, owl:FunctionalProperty ;
rdfs:subPropertyOf d3f:package-data-property ;
rdfs:label "package-namespace" ;
d3f:definition "x package-namespace y: The package x has the namespace y." .
d3f:package-name a owl:DatatypeProperty, owl:FunctionalProperty ;
rdfs:subPropertyOf d3f:package-data-property ;
rdfs:label "package-name" ;
d3f:definition "x package-name y: The package x has the name y." .
d3f:package-version a owl:DatatypeProperty, owl:FunctionalProperty ;
rdfs:subPropertyOf d3f:package-data-property ;
rdfs:label "package-version" ;
d3f:definition "x package-version y: The package x has the version y." .
d3f:package-qualifiers a owl:DatatypeProperty ;
rdfs:subPropertyOf d3f:package-data-property ;
rdfs:label "package-qualifiers" ;
skos:example "arch=i386", "platform=java", "repository_url=gcr.io" ;
d3f:definition "x package-qualifiers y: The package x has the qualifiers y." .
d3f:package-subpath a owl:DatatypeProperty, owl:FunctionalProperty ;
rdfs:subPropertyOf d3f:package-data-property ;
rdfs:label "package-subpath" ;
d3f:definition "x package-subpath y: The package x has the subpath y." .
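d3f:purl a owl:ObjectProperty, owl:InverseFunctionalProperty ;
rdfs:subPropertyOf d3f:package-property ;
# The range is the d3f:PackageURL class rather than xsd:anyURI, because an object property needs a class as its range.
rdfs:range d3f:PackageURL ;
rdfs:label "purl" ;
d3f:definition "x purl y: The package x is identified by the package URL (purl) y, a URL for identifying software packages." .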
Examples
<pkg:bitbucket/birkenfeld/pygments-main@244fd47e07d1014f0aed9c> a d3f:PackageURL .
<pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie> a d3f:PackageURL .
<pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c> a d3f:PackageURL .
<pkg:docker/customer/dockerimage@sha256:244fd47e07d1004f0aed9c?repository_url=gcr.io>
a d3f:PackageURL .
Asserting the triple:
:jruby-launcher d3f:purl <pkg:gem/jruby-launcher@1.1.2?platform=java> .
could materialize:
<pkg:gem/jruby-launcher@1.1.2?platform=java> a d3f:SoftwarePackage .
<#curl-debian> a d3f:SoftwarePackage ;
d3f:package-type d3f:PackageURLType-Deb ;
d3f:package-namespace "debian" ;
d3f:package-name "curl" ;
d3f:package-version "7.50.3-1" ;
d3f:package-qualifiers "arch=i386", "distro=jessie" .
<#batik-anim> a d3f:SoftwarePackage ;
d3f:package-type d3f:PackageURLType-Maven ;
d3f:package-namespace "org.apache.xmlgraphics" ;
d3f:package-name "batik-anim" ;
d3f:package-version "1.9.1" .
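None of the examples above uses a subpath, so here is a minimal sketch of how a purl that has one might map onto the proposed properties. The purl string comes from the purl-spec examples. The d3f:PackageURLType-Golang individual, the <#genproto> node, and the @base IRI are hypothetical names that follow the Maven pattern above, not part of the proposal.

```turtle
@base <http://example.org/packages> .   # placeholder base so <#...> resolves
@prefix d3f: <http://d3fend.mitre.org/ontologies/d3fend.owl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical type individual, modeled on d3f:PackageURLType-Maven.
d3f:PackageURLType-Golang a owl:NamedIndividual, d3f:PackageURLType ;
    rdfs:label "Golang" ;
    rdf:value "golang" .

# pkg:golang/google.golang.org/genproto#googleapis/api/annotations
#   type      = golang
#   namespace = google.golang.org
#   name      = genproto
#   subpath   = googleapis/api/annotations
#   (no version and no qualifiers in this purl)
<#genproto> a d3f:SoftwarePackage ;
    d3f:package-type d3f:PackageURLType-Golang ;
    d3f:package-namespace "google.golang.org" ;
    d3f:package-name "genproto" ;
    d3f:package-subpath "googleapis/api/annotations" ;
    d3f:purl <pkg:golang/google.golang.org/genproto#googleapis/api/annotations> .
```

The optional components that are absent from the purl are simply left out rather than asserted as empty strings.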
References
- https://github.com/package-url/purl-spec
Definition
@prefix d3f: <http://d3fend.mitre.org/ontologies/d3fend.owl#> . @prefix owl: <http://www.w3.org/2002/07/owl#> . @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . d3f:PackageURL a owl:Class ; rdfs:label "Package URL" ; rdfs:subClassOf d3f:Identifier,
^ Should be subclass of d3f:URL
[ a owl:Restriction ; owl:onProperty d3f:identifies ; owl:someValuesFrom d3f:SoftwarePackage ], [ a owl:Restriction ; owl:onProperty d3f:package-type ; owl:someValuesFrom d3f:PackageURLType
...
skos:example "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1", "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?type=pom", "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?classifier=sources", "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?type=zip&classifier=dist", "pkg:maven/net.sf.jacob-projec/jacob@1.14.3?classifier=x86&type=dll", "pkg:maven/net.sf.jacob-projec/jacob@1.14.3?classifier=x64&type=dll" ; rdfs:seeAlso <https://repo.maven.apache.org/maven2> ; d3f:definition """The default repository is https://repo.maven.apache.org/maven2. The group id is the ``namespace`` and the artifact id is the ``name``. Known qualifiers keys are: ``classifier`` and ``type`` as defined in the POM documentation. Note that Maven uses a concept / coordinate called packaging which does not map directly 1:1 to a file extension. In this use case, we need to construct a link to one of many possible artifacts. Maven itself uses type in a dependency declaration when needed to disambiguate between them.""" .
d3f:package-property a owl:ObjectProperty ; rdfs:subPropertyOf d3f:associated-with ;
^^ This is key, and something we've needed to decide on. This issue is forcing me to decide :)
These are more "schema"-oriented fields than what we intend with d3f:associated-with. For us, "associated" is shorthand for "inferentially associated with"; we use these properties to produce all of our various inferred relationships. We almost need a high-level property, called something like d3f:schema-property, to indicate that it's a bit outside of our general model. This is where we'd want to fold in OCSF fields as well. I think you've had some other content that might fall under there; you had an OCSF property generation script. I'd like to see those properties in our proper namespace, but with links back to OCSF. CC @hack-sentinel
In general, we'd want most of these to be in sync with OCSF. If we uncover an issue with OCSF, we should engage them to help improve OCSF and make it ontologically sound.
rdfs:domain d3f:SoftwarePackage ; rdfs:label "package-property" ; d3f:definition "x package-property y: The package x has the object property y." .
d3f:package-data-property a owl:DatatypeProperty ; rdfs:subPropertyOf d3f:d3fend-artifact-data-property ; rdfs:domain d3f:SoftwarePackage ; rdfs:label "package-data-property" ; d3f:definition "x package-data-property y: The package x has the data property y." .
....
Thank you for your feedback! I'll take what you said into consideration and come up with a first pass at this that I'll put up as a PR.
Here is the OCSF extension branch for reference (a bit out of date, for 1.1.0-dev, but same concepts apply): https://github.com/aamedina/d3fend-ontology/tree/aamedina/ocsf/extensions/ocsf, which uses SPARQL-Generate to produce https://github.com/aamedina/d3fend-ontology/blob/aamedina/ocsf/extensions/ocsf/dataset/ocsf.ttl
An advantage of OCSF's dictionary design is that it plays well with the RDF notion of how properties are related to classes. OCSF classes can have their own "restrictions" that pull in OCSF dictionary properties. We don't need process-specific or package-specific properties, but we could have a single super property that indicates OCSF alignment and have the process, package, etc. properties (whatever they end up looking like) inherit from it (for example d3f:schema-property or d3f:ocsf-property, to indicate OCSF provenance for the property semantics).
A schema property could be used to unify and relate the various schemas, so the generic d3f:schema-property is more appropriate.
If we intend to assert provenance, we can use rdfs:isDefinedBy.
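To make the suggestion concrete, here is a rough sketch of what a generic schema super property with per-property provenance could look like. Nothing here is settled: d3f:schema-property is only the name floated in this thread, its definition text is my own wording, and the rdfs:isDefinedBy target is a placeholder rather than a confirmed OCSF term IRI.

```turtle
@prefix d3f: <http://d3fend.mitre.org/ontologies/d3fend.owl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical super property for schema-oriented fields, kept apart from
# d3f:associated-with so these fields do not feed the inferred relationships.
d3f:schema-property a owl:ObjectProperty ;
    rdfs:label "schema-property" ;
    d3f:definition "x schema-property y: The entity x has the schema-defined field y." .

# package-property re-parented under schema-property instead of associated-with.
d3f:package-property a owl:ObjectProperty ;
    rdfs:subPropertyOf d3f:schema-property ;
    rdfs:domain d3f:SoftwarePackage ;
    rdfs:label "package-property" .

# Provenance is asserted on the individual property with rdfs:isDefinedBy
# instead of a dedicated d3f:ocsf-property parent.
# The IRI below is a placeholder, not a confirmed OCSF term.
d3f:package-type a owl:ObjectProperty, owl:FunctionalProperty ;
    rdfs:subPropertyOf d3f:package-property ;
    rdfs:range d3f:PackageURLType ;
    rdfs:isDefinedBy <https://schema.ocsf.io/> .
```

A datatype counterpart would be needed as the parent of d3f:package-data-property, since OWL 2 DL keeps object properties and datatype properties in separate hierarchies.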