RFC: Explicit reporting of negative matches
What would you like to be added?
Grype should be able to report "negative matches," that is, package/vulnerability pairs where grype found evidence that the package is not affected by the vulnerability.
Right now, grype output consists of positive matches, that is, package/vulnerability pairs where the package is believed to be affected by the vulnerability. However, when matching is performed, that is, when grype asks whether a given package is vulnerable to a given vulnerability, there are 3 possible outcomes, and grype currently loses the distinction between the second and third:
- The package is vulnerable -> this yields a match
- There is a search miss, for example, the grype database has no record for that package or its upstream -> this does not yield a match
- There is a search hit, but grype concludes the package is not vulnerable. For example grype's database has a version constraint that excludes the version of the package being considered -> proposed: this yields a "negative match"
In other words, this proposal should enable users to distinguish between two situations that are currently lumped together: a search miss, where grype didn't match because there wasn't data in the database indicating a match, and a negative match, where there is evidence in the database that the package is not vulnerable.
In this proposal, grype can optionally display negative matches.
Why is this needed?
Why would users of grype want to distinguish between these two cases? Here are a few that come to mind:
More confidence in the results
Grype being able to say, "yes, package Foo is affected by CVE-123, but you have version 1.2.4 which is patched," may give users more confidence that they are not affected by CVE-123 than simply the absence of CVE-123 in the output.
Ability to investigate true negatives
Right now, in the case of matches, grype can output match details, that is, information about what artifact was matched, how it was found, why it's believed to be vulnerable. In the case of a negative match, where for example grype matches the package, but then rejects the match based on a version constraint, all of this information is available, but it is dropped in the middle of functions, so that users and even library consumers of grype can't access it.
Additionally, grype explain could be enhanced to explain negative matches.
Users can report on patching improvements directly
When users undertake the effort to patch dependencies in their project and upgrade their base images, rather than simply seeing grype's output shrink, they could see the set of CVEs reported in negative matches grow.
Possibility of implementing namespace precedence
In many cases, a distro has more specific vulnerability information than, for example, GHSA or NVD. Being able to encode this precedence provides a good way to implement https://github.com/anchore/grype/issues/1373, and there are likely other business logic enhancements that could be made from surfacing negative matches.
User experience
- Negative matches will be made available in the context where grype reports are executed, so that users can write custom templates that use negative matches
- Negative matches will be omitted from the JSON by default, but can be included based on a config. If that config is supplied, they will be in a separate top level key.
- Negative matches will be omitted from the table output by default, but can be included based on a config. If that config is supplied, they'll be included in a separate section.
Implementation
Mostly TBD. If NegativeMatch were implemented as its own struct, that would give us 3 types of match structs (the other two are Match and IgnoredMatch). It might be preferable to make the Match struct able to represent why something didn't match, or look for another solution that doesn't involve having 3 nearly identical structs that represent related concepts.
FAQ
What's the difference between a negative match and an ignored match?
An ignored match is a positive match that a user chose to suppress. A negative match is a package/vulnerability combination for which grype is stating the package is likely not affected by the vulnerability.
Why is it called a "negative match"? Can we call it something else?
It's called a negative match because it's the opposite of a match. A regular match (a "positive" match) asserts that a package is vulnerable to a vulnerability; a negative match asserts that it isn't. We also considered the term "AntiMatch" and "PartialMatch." We can certainly change the name if saying things like, "This false positive should really be a negative match" is too confusing, but it's the best name I have right now.
Additional context:
The idea is first proposed under the name "anti-match" (instead of "negative match") at https://github.com/anchore/grype/issues/1373#issuecomment-1673475461
I really like this idea @willmurphyscode! Thanks for writing it up in such great detail.
I don't think it would be too difficult to make the necessary changes in vunnel to surface the not affected entries as version_constraint: < 0 for the v5 schema for now. That should allow someone to start working on this in grype at any point and shouldn't negatively affect the existing results. It would also improve the rhel derivative results in the interim
@westonsteimel Thanks!
For version_constraint: < 0, what should the fix_state be? All the 4 we have right now (unknown, fixed, not-fixed, and wont-fix) seem not quite applicable to the situation. Do we need a schema version change to add not-affected values to this column?
We already handle alpine data this way, the state is currently just set to fixed
select * from vulnerability where version_constraint = '< 0' limit 3;
pk id package_name namespace package_qualifiers version_constraint version_format cpes related_vulnerabilities fixed_in_versions fix_state advisories
------ -------------- ------------ ------------------------- ------------------ ------------------ -------------- ---- ----------------------------------------------- ----------------- --------- ----------
236893 CVE-2020-14342 cifs-utils alpine:distro:alpine:3.10 < 0 apk [{"id":"CVE-2020-14342","namespace":"nvd:cpe"}] ["0"] fixed
237037 CVE-2020-27569 openvpn alpine:distro:alpine:3.10 < 0 apk [{"id":"CVE-2020-27569","namespace":"nvd:cpe"}] ["0"] fixed
237077 CVE-2020-36325 jansson alpine:distro:alpine:3.10 < 0 apk [{"id":"CVE-2020-36325","namespace":"nvd:cpe"}] ["0"] fixed
I don't think it would need to be a breaking schema change to add a new string value to that column though, but I'm not certain. From a db perspective it shouldn't break anything, the question is whether previous versions of grype reading the db will break
That will likely just take testing, but we're definitely not ready for a schema break if that is required here
@willmurphyscode , I just tried locally setting fix_state in the db to a value of not-affected and grype didn't seem to mind, so I think there's a decent chance that adding a new value won't need to be a breaking change
how far in the past would these negative matches go? lets say I use a dependency in Version 2.0.0 but version 1.0.0 had a cve that was critical would it still be reported?
@henrysachs That's a great question, thanks!
I can see how packages with a lot of major versions, you could see negative matches from 10 years ago, and that would be super noisy.
What do you think should happen in the case that there are tons of negative matches for a particular package? Maybe only negative matches that were ruled out by minor or patch versions? Maybe only the N most recent negative matches?
So I had a few days to think about this and I think least N findings would be nice (I think about like 5), but I also thought about how this is complementary to the vex format. I know this is probably more for vulnerabilities where no patches are available but maybe this could play together?
but I also thought about how this is complementary to the vex format. I know this is probably more for vulnerabilities where no patches are available but maybe this could play together?
@henrysachs can you help me understand what you mean here? I think I don't know enough about Vex to understand this comment.