Support for regex in filter object while running /sbom
Discussed in https://github.com/CycloneDX/cdxgen/discussions/1261
Originally posted by arkajnag23 July 23, 2024
I am using CDXGEN in server mode and calling the POST endpoint (/sbom) to generate the SBOM.
I have multi-module Maven projects that combine AngularJS and Maven; CDXGEN generates the SBOM with the required components.
Now I want to exclude certain groups/artifacts to generate a filtered SBOM.
Since I want to exclude rather than include, I tried a regular expression with a negative lookahead, as below:
curl -X POST http://localhost:9090/sbom \
  -H "Content-Type: application/json" \
  -d '{
"path": "/var/<workspace path>",
"type": "maven,js",
"multiProject": true,
"resolveTransitive": true,
"recurse": true,
"installDeps": true,
"filter": "^(?!.*(abc\\|test|)).*$"
}'
While going through the source code, it seems the filterBom method doesn't support regular expressions.
Can someone help with this?
This is correct. exclude is not supported for server mode. Best way to move this forward is to find a contributor, since this is a non-trivial effort.
@arkajnag23 could you try using the exclude attribute with the latest version?
@prabhu Are you referring to 10.8.9?
Thanks @prabhu for adding exclude in server mode, but this does not actually resolve my issue. As the documentation says, exclude is mainly for removing files and directories. I was looking for an option via filter to exclude packages, or something to exclude by group ID or artifact ID, such as excludeGroups or excludeArtifacts. Reason: when we scan, analyze, and generate the SBOM, it includes our internal libraries/dependencies as well, which shouldn't be part of the final report.
Since filter accepts package details, my initial attempt was to use a regex with a negative lookahead to remove what I don't need in the filtered SBOM. I also understand that we can export MVN_ARGS, but that can bring complexities of its own.
Filter is an array of strings where you can pass any part of a purl like group or package name; even maven and gradle profile names.
https://github.com/CycloneDX/cdxgen/blob/be4e4f424a984fc96cd63173d14cd13098dd2865/lib/server/openapi.yaml#L263
@prabhu If my understanding is correct, filter accepts an array of strings for the packages we want to include/extract, not the ones we want to exclude. The number of packages we want to exclude is limited, whereas the number we want to include can be unbounded.
If the filterBom method supported regex, it would be really useful to define something like "^(?!.*(abc\|test|)).*$" so that filter knows what not to include.
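For reference, a minimal sketch of what regex-based exclusion over purls could look like. This is not cdxgen's implementation: keepByRegex and the sample purls are hypothetical. One caveat about the pattern itself: as posted, the group ends with an empty alternative ("|)"), which always matches inside the lookahead, so the negative lookahead would reject every string. The sketch therefore also uses a variant without the empty alternative and the escaped pipe.

```javascript
// Hypothetical helper, not part of cdxgen: keep only the components whose
// purl matches the pattern. With a negative lookahead, "matches" means
// "does not contain any of the listed fragments".
function keepByRegex(components, pattern) {
  const re = new RegExp(pattern);
  return components.filter((c) => re.test(c.purl));
}

// Sample purls are made up for illustration.
const components = [
  { purl: "pkg:maven/com.abc/internal-lib@1.0.0" },
  { purl: "pkg:maven/org.apache.commons/commons-lang3@3.14.0" },
  { purl: "pkg:npm/my-test-utils@2.1.0" },
];

// Variant of the posted pattern: excludes any purl containing "abc" or "test".
const kept = keepByRegex(components, "^(?!.*(abc|test)).*$");
console.log(kept.map((c) => c.purl));

// The pattern as posted: the empty alternative makes the lookahead succeed
// for every input, so the negative lookahead fails and nothing is kept.
const keptAsPosted = keepByRegex(components, "^(?!.*(abc\\|test|)).*$");
console.log(keptAsPosted.length);
```

Matching against the purl string only keeps the behavior predictable; whether a real implementation should also look at other fields is a separate design question.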
Filter is to exclude. Only is to include. Can you give it a try please?
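The difference between the two options can be sketched as follows. This is a simplified illustration of the semantics described in this comment, not cdxgen's code: applyFilter, applyOnly, and the sample purls are hypothetical, and the case-insensitive substring match and the "any one string is enough" rule for multiple strings are assumptions.

```javascript
// Assumption: matching is a case-insensitive substring check on the purl.
function purlContains(component, needle) {
  return component.purl.toLowerCase().includes(needle.toLowerCase());
}

// filter: drop every component whose purl contains any of the strings.
function applyFilter(components, filter = []) {
  return components.filter((c) => !filter.some((s) => purlContains(c, s)));
}

// only: keep just the components whose purl contains one of the strings.
// Assumption: a match on any single string is enough; cdxgen may differ.
function applyOnly(components, only = []) {
  return components.filter((c) => only.some((s) => purlContains(c, s)));
}

// Sample purls are made up for illustration.
const components = [
  { purl: "pkg:maven/com.example.internal/core-lib@1.0.0" },
  { purl: "pkg:maven/org.slf4j/slf4j-api@2.0.13" },
];

const withoutInternal = applyFilter(components, ["com.example.internal"]);
const internalOnly = applyOnly(components, ["com.example.internal"]);
console.log(withoutInternal.map((c) => c.purl));
console.log(internalOnly.map((c) => c.purl));
```

Under these assumptions, a short list of internal group IDs passed as filter is enough to remove internal libraries, without enumerating everything that should stay.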
Thanks @prabhu for clarifying.
But the "contains" check seems to be applied to a field other than the purl.
Correct me if I am wrong here:
Rather than verifying on :
When I use this curl request, many dependencies are analyzed and included:
curl -X POST http://localhost:9090/sbom -H "Content-Type: application/json" -d '{
"path": "/var/EventHub/event-hub-core/event-hub-core/",
"type": "maven,js",
"multiProject": true,
"resolveTransitive": true,
"recurse": true,
"installDeps": true,
"filter" : ["grid.runtime","event-hub-tests"]
}' > event-hub-sbom.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1473k    0 1472k  100   274   8070      1  0:04:34  0:03:06  0:01:28  386k
Whereas with the request below, the result is very different:
curl -X POST http://localhost:9090/sbom -H "Content-Type: application/json" -d '{
"path": "/var/EventHub/event-hub-core/event-hub-core/",
"type": "maven,js",
"multiProject": true,
"resolveTransitive": true,
"recurse": true,
"installDeps": true,
"filter" : ["grid.runtime","event-hub","event-hub-tests"]
}' > event-hub-sbom-1.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1578    0  1292  100   286      6      1  0:04:46  0:03:11  0:01:35   335
The check appears to be made against a value other than the purl.
@prabhu Together with my comment above, I would like to understand why filter was designed this way: "Use --filter to filter components containing the string in the purl or components.properties.value." When components.properties.value matches, most components get excluded, and the filter can no longer be focused on the actual dependencies. Couldn't matching on components.properties.value be handled by exclude instead?
Was there a specific business requirement to have filter check components.properties.value?
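One possible explanation for the much smaller second response, sketched under stated assumptions: if a component carries a property whose value is a file path under the scanned directory, then a filter string such as "event-hub" matches that path even when the purl itself does not contain it. The component data, the SrcFile property, and matchesFilter below are illustrative assumptions, not cdxgen's code or actual output.

```javascript
// Illustration only: returns true when any filter string occurs in the purl
// or, optionally, in any of the component's property values.
function matchesFilter(component, filterStrings, { checkProperties }) {
  const haystacks = [component.purl];
  if (checkProperties) {
    for (const p of component.properties ?? []) {
      haystacks.push(String(p.value));
    }
  }
  return filterStrings.some((s) =>
    haystacks.some((h) => h.toLowerCase().includes(s.toLowerCase())),
  );
}

// Assumed sample data: a third-party dependency and an internal module, both
// recorded with a source file path under the scanned project directory.
const srcFile = {
  name: "SrcFile",
  value: "/var/EventHub/event-hub-core/event-hub-core/pom.xml",
};
const components = [
  { purl: "pkg:maven/org.slf4j/slf4j-api@2.0.13", properties: [srcFile] },
  { purl: "pkg:maven/com.example/event-hub-tests@1.0.0", properties: [srcFile] },
];

const filter = ["grid.runtime", "event-hub", "event-hub-tests"];

// Matching on the purl alone removes only the internal module.
const purlOnly = components.filter(
  (c) => !matchesFilter(c, filter, { checkProperties: false }),
);
// Matching on property values too removes the third-party dependency as well,
// because its source path contains "event-hub".
const purlAndProps = components.filter(
  (c) => !matchesFilter(c, filter, { checkProperties: true }),
);
console.log(purlOnly.length, purlAndProps.length);
```

If this is what happens, restricting the match to the purl, or making the property check opt-in, would keep third-party dependencies in the filtered SBOM.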