guac icon indicating copy to clipboard operation
guac copied to clipboard

[feature] Improve CDX parsing

Open pxp928 opened this issue 1 year ago • 5 comments

Is your feature request related to a problem? Please describe. Currently, the CDX parser appends all dependencies to the top level package which may be inaccurate:

https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L243-L262

Instead, this needs to be done based on the relationships defined by the CDX SBOM.

Describe the solution you'd like

This is not based on the relationship so that can be inaccurate (can capture both direct and in-direct).
This needs to be done by utilizing *c.cdxBom.Dependencies 

see https://github.com/CycloneDX/specification/issues/33

Describe alternatives you've considered None

pxp928 avatar Apr 30 '24 12:04 pxp928

Hey @pxp928, I would like to work on this. Just wanted to ask here how we use the direct dependencies that we get from *c.cdxBom.Dependencies?

Yaxhveer avatar May 09 '24 19:05 Yaxhveer

Hey @Yaxhveer!

So c.cdxBom.Dependencies is implemented https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L285-L309 but we are not following the logic set forth by https://github.com/CycloneDX/specification/issues/33#issuecomment-653169620. We want to be able to check if this dependency is direct, in-direct (transitive) or both. It can be both a direct and in-direct dependency in which case there will be two assembler.IsDependencyIngest created for the direct one and the in-direct one. Currently, we are marking them all as unknown: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/common/helpers.go#L53

It should be unknown by default, but if you can determine it via the logic provided, it should be specified as Direct or In-direct.

Let me know if you have other questions! Looking forward to the contribution :)

pxp928 avatar May 09 '24 19:05 pxp928

Hey @pxp928, just a simple doubt here the dependency type is between top level and foundPkg or between foundPkg and DepPkg (in func GetIsDep()) https://github.com/guacsec/guac/blob/4dc7c94c4aa29eda038e1194342c324ad5962861/pkg/ingestor/parser/common/helpers.go#L49-L56

Yaxhveer avatar May 10 '24 02:05 Yaxhveer

Based off this https://github.com/CycloneDX/specification/issues/33#issuecomment-653169620 it can be both. So if c.cdxBom.Dependencies is defined (not null), we should not do this: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L260 but instead use the relationships defined to create both the top level and foundPkg (direct) and between foundPkg and DepPkg (in-direct) in the function:

https://github.com/guacsec/guac/blob/4dc7c94c4aa29eda038e1194342c324ad5962861/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L285-L309

In that case you will have to remove this lines:

if reflect.DeepEqual(currPkg, toplevel) {
	continue
}

as we dont want to skip.

If c.cdxBom.Dependencies is null then keep doing https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L260 like it is currently.

pxp928 avatar May 10 '24 03:05 pxp928

Thanks just wanted to confirm that.

Yaxhveer avatar May 10 '24 04:05 Yaxhveer