
False positives for multi-stream repositories in the RHCC ecosystem

jasinner opened this issue on Oct 27 '23

Sometimes later container versions in a repository carry version tags that compare lower than earlier ones under RPM version semantics. For example, the container repository registry.redhat.io/openshift-logging/eventrouter-rhel8 used floating tags such as v5.0 for the cpe:/a:redhat:logging:5.0::el8 product, but now uses tags such as v0.4 for the cpe:/a:redhat:logging:5.6::el8 product. We therefore cannot do plain version comparisons for containers in repositories like this; otherwise containers from later streams will show false positive results for vulnerabilities fixed in earlier streams, such as those for CVE-2021-3121.
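To make the failure mode concrete, here is a minimal sketch of the comparison problem. The segment-wise numeric comparison below is a simplified stand-in for RPM's rpmvercmp, not claircore's actual comparator, but the ordering it produces for these two tags matches RPM semantics:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// compareTags is a simplified stand-in for RPM's rpmvercmp: it strips a
// leading "v" and compares dot-separated segments numerically. Real RPM
// comparison handles more cases, but the ordering shown here matches it
// for tags like v0.4 and v5.0.
func compareTags(a, b string) int {
	as := strings.Split(strings.TrimPrefix(a, "v"), ".")
	bs := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < len(as) && i < len(bs); i++ {
		ai, _ := strconv.Atoi(as[i])
		bi, _ := strconv.Atoi(bs[i])
		switch {
		case ai < bi:
			return -1
		case ai > bi:
			return 1
		}
	}
	return len(as) - len(bs)
}

func main() {
	// The v0.4 image belongs to the newer logging:5.6 stream, yet it
	// compares lower than v5.0, so a range like "fixed in v5.0.x"
	// would wrongly flag it as still vulnerable.
	fmt.Println(compareTags("v0.4", "v5.0")) // -1: v0.4 sorts before v5.0
}
```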

Red Hat plans to publish an additional data source dubbed container-cpe-list (see SECDATA-386) for container repositories such as registry.redhat.io/openshift-logging/eventrouter-rhel8. It will:

  • identify them as multi-stream repositories (meaning they have more than one floating tag, like v5.0 and v0.4)
  • specify a CPE for each floating tag

Here's a brief example of the kind of data we can expect in the container-cpe-list.json file:

```json
{
  "openshift-logging/eventrouter-rhel8": {"v0.4": "cpe:/a:redhat:logging:5.6::el8", "v5.0": "cpe:/a:redhat:logging:5.0::el8"},
  "quay/quay-bridge-operator-bundle": {"latest": "cpe:/a:redhat:quay:3::el8"}
}
```

When the scanner creates a package for a container image, it can check the container-cpe-list and add the CPE to that package if the repository name and floating tag match an entry in the file. When matching those packages to fixed-in versions, it should only match when the vulnerability has a fixed-in version carrying that same CPE.
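A minimal sketch of that lookup, assuming the JSON shape above. The type and function names here are hypothetical illustrations, not an existing claircore API:

```go
package main

import "fmt"

// cpeList mirrors the proposed container-cpe-list.json shape:
// repository name -> floating tag -> CPE. The type and function names
// are hypothetical, not part of claircore today.
type cpeList map[string]map[string]string

// cpeForImage returns the CPE to attach to a package created for the
// given repository and floating tag, if the list has an entry for it.
func cpeForImage(list cpeList, repo, tag string) (string, bool) {
	tags, ok := list[repo]
	if !ok {
		return "", false
	}
	cpe, ok := tags[tag]
	return cpe, ok
}

func main() {
	list := cpeList{
		"openshift-logging/eventrouter-rhel8": {
			"v0.4": "cpe:/a:redhat:logging:5.6::el8",
			"v5.0": "cpe:/a:redhat:logging:5.0::el8",
		},
	}
	if cpe, ok := cpeForImage(list, "openshift-logging/eventrouter-rhel8", "v0.4"); ok {
		// A vulnerability would then only match this package when its
		// fixed-in entry carries the same CPE.
		fmt.Println("attach", cpe)
	}
}
```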

jasinner · Oct 27 '23

Why not mandate a uniform tag policy that specifies RPM semantics? Or add this information into the layer itself? Or use CPEs correctly?

Additional state needed to interpret layers sucks as a solution, as a rule. The latency between a container becoming available and a given Clair instance receiving the new data describing that layer is a window in which permanently "incorrect"[^1] results can be persisted into the database. Applying scotch tape to a broken production pipeline is a never-ending process.

[^1]: The result is not incorrect by claircore's model. Even if we wanted to incorporate some indexer state into the per-layer "done" consideration, it's unclear how to do that without every change in that indexer state forcing a re-examination of every layer.

hdonnay · Nov 02 '23

Updated the title to better reflect the issue, and closing to indicate we won't be doing this.

hdonnay · Jun 25 '24