
Low number of findings compared to commercial tool

Open oheger-bosch opened this issue 4 years ago • 10 comments

Hi all,

for OSS compliance and vulnerability reports we use the OSS review toolkit (ORT). The ORT advisor component currently supports querying vulnerability information from VulnerableCode and Sonatype Nexus IQ, which we both use. We host an instance of VulnerableCode and run the importers on a regular schedule.

With this setup in place for about half a year, I did an evaluation of the findings returned by VulnerableCode and Nexus IQ based on the results produced by ORT. The outcome is that the number of findings reported by VulnerableCode is significantly lower than for Nexus IQ, particularly for certain types of packages (NPM, Python, Maven). Find below an excerpt from the results. (The "Packages" column contains the number of packages for which at least one security vulnerability has been reported by one of the systems.)

Type     Packages  Findings IQ  Findings VC
Crate      23           19           17
Gem        22           67           25
Maven     929         2072         1044
NPM       644         1213           48
NuGet      40           42           29
PyPi       68          209           21
All      1729         3639         1184
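As a rough way to read the table, the gap can be expressed as VulnerableCode findings relative to Nexus IQ findings per package type. The sketch below only reuses the figures from the table; the helper name `vc_share` is made up for illustration, and the ratio assumes the two tools count findings in a comparable way, which may not hold exactly.

```python
# Figures copied from the table above: (packages, findings IQ, findings VC).
findings = {
    "Crate": (23, 19, 17),
    "Gem": (22, 67, 25),
    "Maven": (929, 2072, 1044),
    "NPM": (644, 1213, 48),
    "NuGet": (40, 42, 29),
    "PyPi": (68, 209, 21),
    "All": (1729, 3639, 1184),
}


def vc_share(findings_iq, findings_vc):
    """Return VulnerableCode findings as a percentage of Nexus IQ findings."""
    return round(100 * findings_vc / findings_iq, 1)


# Hypothetical helper output: package type -> percentage, lowest share first.
shares = {ptype: vc_share(iq, vc) for ptype, (_, iq, vc) in findings.items()}
for ptype, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{ptype:6} {share:5.1f}%")
```

Sorted this way, NPM and PyPi come out at the bottom, which matches the package types called out above as having the largest gap.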

The projects that have been scanned by ORT to produce these numbers are currently ongoing software development projects. I assume they use a typical set of library dependencies with up-to-date versions.

Now I am trying to investigate the reasons for these differences. What I have tried so far is the following:

  • I checked that the importers are actually running successfully and populate the database. At least, I did not see any suspicious logs during the execution. Here are some figures regarding the number of packages in our database, generated by the query
SELECT vp."type" pt, COUNT(*)
FROM vulnerabilities_package vp
GROUP BY pt
which returns:
Maven   21225
NPM     12077
PyPi    13768

Does this look plausible or do we miss relevant data from sources?

  • ORT queries VulnerableCode via the Bulk API passing in a list of PURLs for the packages in question. To rule out bugs in the interaction between these tools, I queried the VulnerableCode API manually, but came to similar results.
  • I tried to match the packages found by ORT directly in the VulnerableCode database, circumventing the API; but again, I did not find more matches.
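For anyone who wants to repeat the manual API check, here is a minimal sketch of how such a bulk query could be prepared. The endpoint path `/api/packages/bulk_search`, the `{"purls": [...]}` payload shape, the base URL, and the example purl are all assumptions for illustration; please check them against the API documentation of your own VulnerableCode instance. The code only builds the requests and does not send them.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"        # hypothetical instance address
BULK_PATH = "/api/packages/bulk_search"   # assumed endpoint path, verify locally


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_bulk_request(purls, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a list of purls."""
    body = json.dumps({"purls": list(purls)}).encode("utf-8")
    return Request(
        base_url + BULK_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative purl only; replace with the purls reported by ORT.
example_purls = ["pkg:npm/lodash@4.17.20"]
requests_to_send = [build_bulk_request(batch) for batch in chunked(example_purls, 100)]
```

Each request can then be sent with `urllib.request.urlopen` (adding whatever authentication the instance requires), and the JSON responses compared against what ORT reports.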

So, the question is, do you have any ideas/suggestions what could be the cause for this low number of findings? Is our database corrupt or is VulnerableCode missing important sources of vulnerability information? Any help would be appreciated.

oheger-bosch, Oct 26 '21 06:10

@oheger-bosch Thank you ++ for this detailed report!

Could you provide a list of purl/CVEs combos that you found missing in VulnerableCode? (If it is big you can send these zipped privately at [email protected])

That way we can investigate why this is happening: it could be either a bug or missing data source or both, but it is going to be hard to investigate short of hard data.
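One possible way to produce such a list is a set difference over (purl, CVE) pairs exported from both tools. This is only a sketch: the function name and the sample pairs are hypothetical placeholders, not real findings, and it assumes both exports use the same purl spelling for a given package.

```python
def missing_in_vulnerablecode(iq_pairs, vc_pairs):
    """Return sorted (purl, cve) pairs reported by Nexus IQ but not by VulnerableCode."""
    return sorted(set(iq_pairs) - set(vc_pairs))


# Placeholder data for illustration only (not actual vulnerabilities).
iq_pairs = [
    ("pkg:npm/example-a@1.0.0", "CVE-0000-0001"),
    ("pkg:npm/example-a@1.0.0", "CVE-0000-0002"),
    ("pkg:pypi/example-b@2.3.0", "CVE-0000-0003"),
]
vc_pairs = [
    ("pkg:npm/example-a@1.0.0", "CVE-0000-0001"),
]

for purl, cve in missing_in_vulnerablecode(iq_pairs, vc_pairs):
    print(purl, cve)
```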

pombredanne, Oct 26 '21 09:10

@pombredanne I have sent a mail with packages / CVEs that have been reported by Nexus IQ, but not by VulnerableCode.

oheger-bosch, Oct 26 '21 12:10

I have started checking the differences and that's awesome data. @oheger-bosch Thank you +++ I will report back here when done.

pombredanne, Oct 29 '21 07:10

Thank you for looking at this so timely @pombredanne. It's quite important for us to understand what causes these differences in findings, as we're still planning for a major rollout of VulnerableCode, but this is being blocked by the concerns raised here.

sschuberth, Oct 29 '21 08:10

This might be related: I found many CVEs without packages. To take a completely random sample, consider CVE-2020-15222.

Querying the NVD CVE database we get at least one package (or CPE in this case: cpe:2.3:a:ory:fosite:*:*:*:*:*:*:*:*), e.g.

curl https://services.nvd.nist.gov/rest/json/cve/1.0/CVE-2020-15222

Checking VulnerableCode, I can find the CVE, but it has zero packages: [image]

Is this just because purls and CPEs don't really play well together, or should the data be consistent?
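To illustrate the mismatch, here is a small sketch that splits the CPE 2.3 string from above into its named fields. The parser is a simplification that assumes no escaped colons inside the fields, and the function name is made up for this example.

```python
# Field names of a CPE 2.3 formatted string, in order, after the "cpe:2.3" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe):
    """Split a CPE 2.3 string into a dict (simplified: ignores escaped colons)."""
    parts = cpe.split(":")
    if len(parts) != 13 or parts[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


fields = parse_cpe23("cpe:2.3:a:ory:fosite:*:*:*:*:*:*:*:*")
print(fields["vendor"], fields["product"], fields["version"])
```

The CPE carries a vendor and a product, but no package type or ecosystem, and the version here is a wildcard. A purl needs a type (and usually a concrete version), so the mapping cannot be derived from the CPE alone.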

jlarfors, Nov 18 '21 20:11

@jlarfors Thank you for the details. For CVE-2020-15222, this is a different bug:

  1. we should have created a package relationship for pkg:github.com/ory/fosite@0c9e0f6d654913ad57c507dd9a36631e1858a3e9 which seems to be the fix commit
  2. and pkg:github.com/ory/[email protected] which is a fixed version
  3. and pkg:github.com/ory/[email protected] which is the first affected version
  4. and pkg:github.com/ory/[email protected] which is the affected version in between
  5. and a version range

@Hritik14 ^ :)

pombredanne, Nov 19 '21 18:11

@jlarfors and possibly also other package relationships to be inferred as this is a Go package based on https://github.com/ory/fosite/blob/master/go.mod

pombredanne, Nov 19 '21 18:11

We're missing the fosite vulnerability since there's no importer for the Go ecosystem. However, this should be fixed by https://github.com/nexB/vulnerablecode/pull/578

sbs2001, Dec 06 '21 05:12

Dear all, is there any progress to report regarding this issue?

sschuberth, May 06 '22 13:05

@sschuberth We are working on improving the data quality of VulnerableCode. A lot of it can be tracked in https://github.com/nexB/vulnerablecode/issues/597 and the other open tickets. We are also considering a project to compare VulnerableCode data to the rest of the world with something like VulnTotal.

Hritik14, May 07 '22 11:05