guac
CycloneDX Ingestion uses wrong edge type
Apologies if I've misunderstood the terminology, but I was playing around trying to import some of my own CycloneDX SBOMs, and it led me to the test for the parser, which I think expects the wrong edge type.
The sample data has components but no explicit dependencies enumerated, so I think the edges ought to be ContainsEdge. As best I can tell, the CycloneDX code doesn't yet support actual dependency graphs. All of the edges are connected to the root package, rather than forming a DAG.
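To make the "star versus DAG" distinction concrete, here is a minimal sketch in Python. It is not GUAC code; the function name `is_flat` and the package names are made up for illustration, and edges are assumed to be plain (source, target) pairs.

```python
def is_flat(edges, root):
    """Return True when every edge starts at root (a star, not a deeper DAG)."""
    return all(src == root for src, _ in edges)


# Hypothetical edge lists: the first is what a flat component list produces,
# the second is what an explicit dependency graph would produce.
flat = [("root-pkg", "lib-a"), ("root-pkg", "lib-b")]
nested = [("root-pkg", "lib-a"), ("lib-a", "lib-b")]

print(is_flat(flat, "root-pkg"))
print(is_flat(nested, "root-pkg"))
```

With the flat list, nothing records that lib-b is only present because of lib-a, which is the information a dependency query needs.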
Thanks for opening the issue @gsoltis !
FYI: @nadgowdas
Just took a deeper look at this. To preface, some of these definitions are still being evaluated as we go, and this is part of finding a common vocabulary that makes sense across technologies and formats.
The "ContainsEdge" is very specific to what SPDX refers to as a package containing files. Right now we are treating it as a stronger version of "depends on", where "contains" is almost equivalent to physical encapsulation. In our queries so far we haven't needed to distinguish between the two; we just always use -[Contains|DependsOn]->.
I have created a new issue label called predicates for a longer-term discussion about something we call the "Predicate Dictionary", which will be an enumeration of the different edges and assertions we want to make about any of the entities that are part of the graph.
Hope this clarifies things!
I'm not sure I follow the distinction between, e.g., a static binary's dependencies and physically containing a library. But maybe that's solved by documentation or a richer set of predicates.
At any rate, what I was looking for was to get the dependency graph for querying, rather than a dependency list. That way I'm hoping to be able to identify direct dependencies across a set of binaries that bring in transitive dependencies that have been flagged to be a problem for whatever reason. Identifying the direct dependencies gives the dev a starting point on what to change.
Is that something you intend to eventually be solvable by this project, or would that be out-of-scope?
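The query described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not something GUAC provides: the graph is assumed to be a plain dict mapping each package to the packages it depends on, and the names `reachable`, `direct_deps_to_review`, and all package names are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Return every package reachable from start by following dependency edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def direct_deps_to_review(graph, root, flagged):
    """Map each direct dependency of root to the flagged packages it pulls in."""
    result = {}
    for direct in graph.get(root, ()):
        # A direct dependency counts if it is flagged itself or leads to a flagged one.
        hits = ({direct} | reachable(graph, direct)) & flagged
        if hits:
            result[direct] = sorted(hits)
    return result


# Hypothetical graph: package -> packages it depends on.
graph = {
    "app": ["web-framework", "logging-lib"],
    "web-framework": ["http-core"],
    "http-core": ["parser"],
    "logging-lib": [],
}
flagged = {"parser"}

print(direct_deps_to_review(graph, "app", flagged))
# prints {'web-framework': ['parser']}
```

Running this over each binary's root gives the list of direct dependencies a developer would start from. It only works if the ingested edges form a real graph; with a flat list every package looks like a direct dependency.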
This is very much in scope, and how much we can answer depends on the fidelity of the ingested documents. For example, some SPDX documents include the nested dependency tree, so ingesting those will provide a full dependency graph. However, if the document only provides a flattened list, then we will need other sources of information to augment the relationships.
We plan to add more plugin support to ingest information from language ecosystems and deps.dev, to further augment the information. So even if an SBOM provides just a list, we will be able to tell you more based on the other sources that we can draw from!
Please let me join this discussion to make sure I understand everything correctly.
I also played a little bit with guac and tried to import a CycloneDX SBOM. To make sure this is a reproducible case, you can run the following commands to create the same SBOM:
git clone https://github.com/quarkusio/quarkus-quickstarts.git
cd quarkus-quickstarts
mvn org.cyclonedx:cyclonedx-maven-plugin:2.7.1:makeBom
.../bin/guacone files --creds neo4j:s3cr3t target/bom.json
It seems no edges are identified in that BOM, so no dependency graph is built in neo4j; only single nodes are created. But as far as I understand, the SBOM actually contains the dependency information that could be used to build the graph. In the SBOM it looks like this:
{
components: [...],
dependencies: [
{
ref: getting-started
dependsOn: [resteasy-reactive]
},
{
ref: resteasy-reactive
dependsOn: [resteasy-common, ...]
}
...
]
}
So based on the sample above, I do not think it would be necessary to use information from the language ecosystem or deps.dev. The details are in the SBOM but are not being parsed.
When I started my tests with GUAC, I actually expected to find this dependency information in the neo4j db after import. I'm still not sure whether something is wrong with my expectation, the format, or my usage of guac :).
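For reference, turning the dependencies array shown above into edges could look roughly like this in Python. This is a sketch, not GUAC's parser (which is written in Go): it relies only on the `ref` and `dependsOn` fields of the CycloneDX JSON format, the function name `dependency_edges` is made up, and the sample BOM is a trimmed, hypothetical stand-in using the abbreviated refs from this thread.

```python
import json


def dependency_edges(bom):
    """Yield (ref, depends_on_ref) pairs from a CycloneDX BOM's dependencies array."""
    for entry in bom.get("dependencies", []):
        ref = entry.get("ref")
        for target in entry.get("dependsOn", []):
            yield (ref, target)


# Trimmed, hypothetical BOM; real refs are usually full bom-ref strings.
sample = json.loads("""
{
  "components": [],
  "dependencies": [
    {"ref": "getting-started", "dependsOn": ["resteasy-reactive"]},
    {"ref": "resteasy-reactive", "dependsOn": ["resteasy-common"]}
  ]
}
""")

edges = list(dependency_edges(sample))
for src, dst in edges:
    print(f"{src} -[DependsOn]-> {dst}")
```

Each pair would become one DependsOn edge between the two referenced components, which is enough to rebuild the graph without any outside data source.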
Hi @albert0815, this is probably an oversight in the parser. This is a slightly different issue from the one open here. Would you mind opening a new issue so we can work to address this? We should be parsing the dependencies field and creating the edges from that.
FYI @nadgowdas
The fix was completed on issue #206. It also relates to the old CDX parser and architecture.