Unable to flag an issue on org.apache.kafka:kafka_2.13:3.7.1

Open ketan opened this issue 8 months ago • 4 comments

My environment:

% ~/.gradle/osv-scanner -v                                      
osv-scanner version: 2.0.1
commit: be9015f3256940e63d99b9d1a009f99c7dc4d8ec
built at: 2025-04-03T02:12:38Z

I'm using org.apache.kafka:kafka_2.13:3.7.1. This package has a known vulnerability, https://osv.dev/vulnerability/GHSA-p7c9-8xx8-h74f, which osv-scanner does not flag.

1 % ~/.gradle/osv-scanner scan source --sbom build/reports/bom.json 
Scanned /private/tmp/osv-scanner-bug/build/reports/bom.json file and found 55 packages
╭─────────────────────────────────────┬──────┬───────────┬────────────────────────┬───────────────┬────────────────────────╮
│ OSV URL                             │ CVSS │ ECOSYSTEM │ PACKAGE                │ VERSION       │ SOURCE                 │
├─────────────────────────────────────┼──────┼───────────┼────────────────────────┼───────────────┼────────────────────────┤
│ https://osv.dev/GHSA-78wr-2p64-hpwj │ 8.7  │ Maven     │ commons-io:commons-io  │ 2.11.0        │ build/reports/bom.json │
│ https://osv.dev/GHSA-389x-839f-4rhx │ 5.5  │ Maven     │ io.netty:netty-common  │ 4.1.105.Final │ build/reports/bom.json │
│ https://osv.dev/GHSA-xq3w-v528-46rv │ 5.5  │ Maven     │ io.netty:netty-common  │ 4.1.105.Final │ build/reports/bom.json │
│ https://osv.dev/GHSA-4g8c-wm8x-jfhw │ 7.5  │ Maven     │ io.netty:netty-handler │ 4.1.105.Final │ build/reports/bom.json │
╰─────────────────────────────────────┴──────┴───────────┴────────────────────────┴───────────────┴────────────────────────╯

The generated SBOM bom.json contains the following bomref:

      "type" : "library",
      "bom-ref" : "pkg:maven/org.apache.kafka/kafka_2.13@3.7.1?type=jar",
      "group" : "org.apache.kafka",
      "name" : "kafka_2.13",
      "version" : "3.7.1",

However, it's unclear why osv-scanner is unable to flag this. My suspicion was that it has something to do with the Scala-specific _2.13 suffix on the name. I tried replacing the _2.13 suffix in the SBOM, but the dependency still isn't flagged.

From https://www.scala-sbt.org/1.x/docs/Cross-Build.html#Publishing+conventions

The underlying mechanism used to indicate which version of Scala a library was compiled against is to append _<scala-binary-version> to the library’s name. For example, the artifact name dispatch-core_2.12 is used when compiled against Scala 2.12.0, 2.12.1 or any 2.12.x version. This fairly simple approach allows interoperability with users of Maven, Ant and other build tools.
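To make the convention concrete, here is a minimal Python sketch that splits an artifact name into its base name and Scala binary version. The helper name `split_scala_suffix` and the regex are illustrative assumptions, not part of osv-scanner or sbt; the pattern only covers suffixes shaped like `_2.NN` or `_3`.

```python
import re

# Assumption: Scala binary versions look like "2.10" ... "2.13", or "3".
_SCALA_SUFFIX = re.compile(r"_(2\.\d+|3)$")


def split_scala_suffix(artifact_id):
    """Return (base_name, scala_binary_version); the version is None if absent."""
    match = _SCALA_SUFFIX.search(artifact_id)
    if match is None:
        return artifact_id, None
    return artifact_id[:match.start()], match.group(1)
```

For example, `split_scala_suffix("kafka_2.13")` separates `kafka` from `2.13`, while a plain Java artifact such as `commons-io` comes back unchanged with `None`.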

ketan avatar Apr 03 '25 04:04 ketan

It does seem to be because of _2.13: after removing it, the scanner was able to identify the known vuln. (It needs to be removed from the PURL, not the package name.)
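As a stopgap, the PURL edit described above could be scripted. The following is a rough sketch only: it assumes a CycloneDX JSON document whose top-level `components` entries carry a `purl` field, and that this field is what the scanner reads. The function name `strip_scala_suffix_from_purls` is hypothetical, and nested components are not handled.

```python
import copy
import re

# Assumption: a Maven PURL whose artifact name ends in a Scala binary suffix,
# e.g. pkg:maven/org.apache.kafka/kafka_2.13@3.7.1?type=jar
_PURL_SUFFIX = re.compile(r"^(pkg:maven/[^/@?]+/[^/@?]+?)_(?:2\.\d+|3)(?=@)")


def strip_scala_suffix_from_purls(bom):
    """Return a copy of a CycloneDX BOM dict with Scala suffixes removed from PURLs.

    Only the "purl" field of top-level components is rewritten; "name" and
    every other field are left untouched.
    """
    out = copy.deepcopy(bom)
    for component in out.get("components", []):
        purl = component.get("purl")
        if isinstance(purl, str):
            component["purl"] = _PURL_SUFFIX.sub(r"\1", purl)
    return out
```

Loading `bom.json` with `json.load`, passing it through this function, and writing the result to a separate file would keep the original SBOM intact. This changes what the SBOM claims about the dependency, so it is only suitable for local triage.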

Thanks for linking to the publishing conventions. This is a bug we would like to fix, though I'm currently not sure whether the suffix should be stripped on the API side or on the extraction side.

@oliverchang @cuixq thoughts?

another-rex avatar Apr 04 '25 02:04 another-rex

It seems incorrect to strip this on the extraction side to me, because kafka_2.13 is a legitimate package that exists on Maven Central.

Either we need to fix this on the data side (GHSA), or we need to fix this on the matching side. Is missing results due to this convention a prevalent issue in the Maven ecosystem?

Having this be part of matching behaviour does seem a bit hacky, and we'd need to change this in 2 places:

  • OSV.dev
  • Local matching

If it's a prevalent issue, then realistically we may need to handle this in the matching side.
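For illustration, a matching-side fallback might look roughly like the Python sketch below. Everything here is hypothetical: `candidate_names`, `match`, the in-memory `advisories` dict and the `EXAMPLE-…` IDs are stand-ins, not how OSV.dev or osv-scanner's local matching is implemented.

```python
import re

# Assumption: Scala binary suffixes look like "_2.13" or "_3".
_SCALA_SUFFIX = re.compile(r"_(?:2\.\d+|3)$")


def candidate_names(package):
    """Return the names to try for "group:artifact": exact first, then unsuffixed."""
    group, _, artifact = package.partition(":")
    stripped = _SCALA_SUFFIX.sub("", artifact)
    names = [package]
    if stripped != artifact:
        names.append(f"{group}:{stripped}")
    return names


def match(package, advisories):
    """Collect advisory IDs for every candidate name, without duplicates.

    `advisories` is a toy index mapping a package name to a list of IDs.
    """
    found = []
    for name in candidate_names(package):
        for advisory_id in advisories.get(name, []):
            if advisory_id not in found:
                found.append(advisory_id)
    return found
```

With a toy index such as `{"org.apache.kafka:kafka": ["EXAMPLE-0001"]}`, a lookup for `org.apache.kafka:kafka_2.13` would fall back to the unsuffixed name. The same logic would have to exist in both places listed above, which is the cost being weighed here.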

oliverchang avatar Apr 04 '25 02:04 oliverchang

> Is missing results due to this convention a prevalent issue in the Maven ecosystem?
>
> If it's a prevalent issue, then realistically we may need to handle this in the matching side.

This should answer it: Scala is the 2nd most popular JVM language, after Kotlin (https://mvnrepository.com/open-source/jvm-languages).

Quite a few libraries in the big data world that are published to Maven are built with (and for) Scala/sbt. This would be a prevalent issue, IMO. Some examples that come to the top of my mind: Kafka (as in this example), Akka, Play, and Apache Spark (the engine behind Databricks).

ketan avatar Apr 04 '25 04:04 ketan

I think we should fix the data instead of stripping the suffix, since the same package with different suffixes (for example org.apache.kafka:kafka_2.12 and org.apache.kafka:kafka_2.13) is treated as different packages, and Maven cannot identify the package without the suffix (org.apache.kafka:kafka).

I do notice there are vulnerabilities with the suffix in the package name, e.g. https://github.com/advisories/GHSA-3j6g-hxx5-3q26, so listing the affected package with the suffix is not unheard of. I guess we could ask the data source to fix this record, i.e. append the suffix to the affected package name.
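One way to check which package name a record is filed under is to query the OSV API with both spellings and compare the returned IDs. The sketch below is a minimal Python example using the public `POST https://api.osv.dev/v1/query` endpoint; the helper names `build_query` and `query_osv` are illustrative, and no results are shown because they depend on the state of the advisory data at query time.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name, version, ecosystem="Maven"):
    """Build the JSON body for an OSV query by package name and version."""
    return {"version": version, "package": {"name": name, "ecosystem": ecosystem}}


def query_osv(name, version, ecosystem="Maven"):
    """Send the query and return the matching advisory IDs (needs network access)."""
    body = json.dumps(build_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # An empty JSON object means no advisories matched this name and version.
    return [vuln["id"] for vuln in payload.get("vulns", [])]
```

Calling `query_osv("org.apache.kafka:kafka", "3.7.1")` and `query_osv("org.apache.kafka:kafka_2.13", "3.7.1")` side by side would show whether a given advisory is listed under the suffixed name, the unsuffixed name, or both.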

cuixq avatar Apr 07 '25 00:04 cuixq

Update on this: we spoke with the GHSA team, and they are aware and working on updating the records, though I don't have a timeline for when this will be complete.

another-rex avatar Apr 30 '25 04:04 another-rex

This issue has not had any activity for 60 days and will be automatically closed in two weeks

See https://github.com/google/osv-scanner/blob/main/CONTRIBUTING.md for how to contribute a PR if you're interested in helping out.

github-actions[bot] avatar Jun 29 '25 05:06 github-actions[bot]

I've filed https://github.com/github/advisory-database/issues/5781 to track this issue on the data side.

oliverchang avatar Jun 30 '25 23:06 oliverchang