guac
guac copied to clipboard
[feature] Improve CDX parsing
Is your feature request related to a problem? Please describe. Currently, the CDX parser appends all dependencies to the top level package which may be inaccurate:
https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L243-L262
Instead, this needs to be done based on the relationships defined by the CDX SBOM.
Describe the solution you'd like
This is not based on the relationship so that can be inaccurate (can capture both direct and in-direct).
This needs to be done by utilizing *c.cdxBom.Dependencies
see https://github.com/CycloneDX/specification/issues/33
Describe alternatives you've considered None
Hey @pxp928, I would like to work on this. Just wanted to ask here how we use the direct dependencies that we get from *c.cdxBom.Dependencies?
Hey @Yaxhveer!
So c.cdxBom.Dependencies is implemented https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L285-L309 but we are not following the logic set forth by https://github.com/CycloneDX/specification/issues/33#issuecomment-653169620. We want to be able to check if this dependency is direct, in-direct (transitive) or both. It can be both a direct and in-direct dependency in which case there will be two assembler.IsDependencyIngest created for the direct one and the in-direct one. Currently, we are marking them all as unknown: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/common/helpers.go#L53
It should be unknown by default, but if you can determine it via the logic provided, it should be specified as Direct or In-direct.
Let me know if you have other questions! Looking forward to the contribution :)
Hey @pxp928, just a simple doubt here the dependency type is between top level and foundPkg or between foundPkg and DepPkg (in func GetIsDep()) https://github.com/guacsec/guac/blob/4dc7c94c4aa29eda038e1194342c324ad5962861/pkg/ingestor/parser/common/helpers.go#L49-L56
Based off this https://github.com/CycloneDX/specification/issues/33#issuecomment-653169620 it can be both. So if c.cdxBom.Dependencies is defined (not null), we should not do this: https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L260 but instead use the relationships defined to create both the top level and foundPkg (direct) and between foundPkg and DepPkg (in-direct) in the function:
https://github.com/guacsec/guac/blob/4dc7c94c4aa29eda038e1194342c324ad5962861/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L285-L309
In that case you will have to remove this lines:
if reflect.DeepEqual(currPkg, toplevel) {
continue
}
as we dont want to skip.
If c.cdxBom.Dependencies is null then keep doing https://github.com/guacsec/guac/blob/main/pkg/ingestor/parser/cyclonedx/parser_cyclonedx.go#L260 like it is currently.
Thanks just wanted to confirm that.