guac CycloneDX Ingestion uses wrong edge type

Apologies if I've misunderstood the terminology, but I was experimenting with importing some of my own CycloneDX SBOMs, which led me to the test for the parser, and I think that test expects the wrong edge type.

The sample data has components but no explicit dependencies enumerated, so I think the edges ought to be ContainsEdge. As best I can tell, the CycloneDX code doesn't yet support actual dependency graphs: all of the edges are connected to the root package rather than forming a DAG.

Nov 03 '22 22:11 gsoltis

Thanks for opening the issue @gsoltis !

FYI: @nadgowdas

Nov 07 '22 17:11 lumjjb

Just took a deeper look at this. To preface, some of these definitions are still being evaluated as we go, and this is part of finding a common vocabulary that makes sense across technologies and formats.

The "ContainsEdge" is very specific to what SPDX refers to as a package containing files. Right now we are treating it as a stronger version of "depends on", where "contains" is almost equivalent to physical encapsulation. So far we haven't needed to distinguish between the two in queries; we always use -[Contains|DependsOn]->.
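To illustrate the idea of treating both edge types as traversable in a query, here is a minimal Python sketch over an in-memory edge list. It is not GUAC code and not a Cypher query; the node names and the `BuiltBy` edge type are hypothetical and exist only to show an edge type that the traversal ignores.

```python
from collections import deque

# Hypothetical edges as (source, edge_type, target) triples.
# "BuiltBy" is an invented edge type, included only so that
# there is something the traversal skips.
EDGES = [
    ("app", "DependsOn", "libfoo"),
    ("libfoo", "Contains", "foo.c"),
    ("app", "BuiltBy", "builder"),
]


def reachable(start, edges, edge_types=("Contains", "DependsOn")):
    """Return every node reachable from start over the given edge types."""
    adjacency = {}
    for src, kind, dst in edges:
        if kind in edge_types:
            adjacency.setdefault(src, []).append(dst)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With the default edge types, `reachable("app", EDGES)` follows both `Contains` and `DependsOn`; passing `edge_types=("DependsOn",)` restricts the traversal to the weaker relationship only.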

I have created a new issue label called predicates for a longer-term discussion about something called the "Predicate Dictionary", which will be an enumeration of the different edges and assertions we want to make on any of the entities that are part of the graph.

Hope this clarifies things!

Nov 07 '22 18:11 lumjjb

I'm not sure I follow the distinction between, e.g., a static binary's dependencies and physically containing a library. But maybe that's solved by documentation or a richer set of predicates.

At any rate, what I was looking for was to get the dependency graph for querying, rather than a dependency list. That way I'm hoping to be able to identify direct dependencies across a set of binaries that bring in transitive dependencies that have been flagged to be a problem for whatever reason. Identifying the direct dependencies gives the dev a starting point on what to change.

Is that something you intend to eventually be solvable by this project, or would that be out-of-scope?
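The kind of query described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the graph is a plain adjacency dict, and all package names are hypothetical. It says nothing about how GUAC stores or queries its graph.

```python
def direct_deps_reaching(root, graph, flagged):
    """Map each direct dependency of root to the flagged packages it pulls in.

    graph maps a package name to the list of packages it depends on.
    A direct dependency that is itself flagged is reported as well.
    """

    def closure(node):
        # The node itself plus everything reachable from it.
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(graph.get(current, []))
        return seen

    result = {}
    for dep in graph.get(root, []):
        hits = closure(dep) & flagged
        if hits:
            result[dep] = hits
    return result


# Hypothetical dependency graph for a single binary.
GRAPH = {
    "service-a": ["web-framework", "logging"],
    "web-framework": ["http-core"],
    "http-core": ["bad-parser"],
    "logging": [],
}

culprits = direct_deps_reaching("service-a", GRAPH, {"bad-parser"})
```

Here `culprits` names `web-framework` as the direct dependency to look at, since it is the one that transitively brings in `bad-parser`. Running the same function over several roots would cover a set of binaries. None of this works with a flattened list, which is why the full graph matters.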

Nov 07 '22 18:11 gsoltis

> At any rate, what I was looking for was to get the dependency graph for querying, rather than a dependency list. That way I'm hoping to be able to identify direct dependencies across a set of binaries that bring in transitive dependencies that have been flagged to be a problem for whatever reason. Identifying the direct dependencies gives the dev a starting point on what to change.

This is very much in scope, and how much we can do depends on the fidelity of the documents ingested. For example, some SPDX documents contain the nested dependency tree, so ingesting those will provide a full dependency graph. However, if the information in the document is limited to a flattened list, then we will need other sources of information to augment the relationships.

We plan to add more plugin support to ingest information from language ecosystems and deps.dev to further augment the graph. So even if an SBOM provides just a list, we will be able to tell you more based on the other sources that we can draw from!

Nov 07 '22 19:11 lumjjb

Please let me join this discussion to make sure I understand everything correctly.

I also played a little bit with guac and tried to import a CycloneDX SBOM. To make this reproducible, you can run the following commands to create the same SBOM:

git clone https://github.com/quarkusio/quarkus-quickstarts.git
cd quarkus-quickstarts
mvn org.cyclonedx:cyclonedx-maven-plugin:2.7.1:makeBom
.../bin/guacone files --creds neo4j:s3cr3t target/bom.json

It seems no edges are identified in that BOM, so no dependency graph is built in neo4j; only isolated nodes are created. But as far as I understand, the SBOM actually contains the dependency information that could be used to build the graph. In the SBOM it looks like this:

{
  "components": [...],
  "dependencies": [
    {
      "ref": "getting-started",
      "dependsOn": ["resteasy-reactive"]
    },
    {
      "ref": "resteasy-reactive",
      "dependsOn": ["resteasy-common", ...]
    },
    ...
  ]
}

So based on the sample above, I do not think it would be necessary to use information from the language ecosystem or deps.dev. The details are in the SBOM but are not being parsed.
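As a rough illustration of what reading that section could look like, here is a small Python sketch that turns the `dependencies` array of a CycloneDX JSON BOM into (package, dependency) pairs. It is not GUAC's parser; the function name is made up for this example, and the sample uses the shortened refs from the snippet above rather than the full refs a real BOM would carry.

```python
import json


def dependency_edges(bom):
    """Yield (ref, dependency) pairs from a CycloneDX BOM's dependencies array."""
    for entry in bom.get("dependencies", []):
        # An entry without dependsOn is a leaf and contributes no edges.
        for target in entry.get("dependsOn", []):
            yield entry["ref"], target


# Shortened, hypothetical sample modelled on the snippet above.
SAMPLE = json.loads("""
{
  "components": [],
  "dependencies": [
    {"ref": "getting-started", "dependsOn": ["resteasy-reactive"]},
    {"ref": "resteasy-reactive", "dependsOn": ["resteasy-common"]},
    {"ref": "resteasy-common"}
  ]
}
""")

edges = list(dependency_edges(SAMPLE))
```

Each pair corresponds to one "depends on" edge, so the result forms a graph rather than a flat list hanging off the root package.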

When I started my tests with GUAC, I actually expected to find this dependency information in the neo4j db after import. I'm still not sure whether something is wrong with my expectation, with the format, or with my usage of guac :).

Nov 09 '22 13:11 albert0815

Hi @albert0815, this is probably an oversight in the parser. This is a slightly different issue from the one open here. Would you mind opening a new issue so we can work to address it? We should be parsing the dependencies field and creating the edges from that.

FYI @nadgowdas

Nov 09 '22 17:11 lumjjb

The fix was completed on issue #206. It also relates to the old CDX parser and architecture.

May 18 '23 14:05 pxp928
