Dubious product names from GAD source
Description
gad_source.py always takes the last part of the product slug as the product name. This gives product names such as lib or v3, because vulnerabilities are stored under folders like:
- gad/gemnasium-db-master-go/go/go.etcd.io/etcd/client/v3
- gad/gemnasium-db-master-go/go/go.mozilla.org/sops/v3
- gad/gemnasium-db-master-go/go/github.com/cloudflare/cfrpki/sync/lib
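The effect can be shown in isolation. This is a minimal sketch that assumes the name is obtained by splitting the path on / and keeping the last element; it is an illustration only, not the actual code in gad_source.py, and last_part is a made-up name.

```python
# Illustration only: mimics "take the last part of the slug" on the
# three folders listed above.
slugs = [
    "gad/gemnasium-db-master-go/go/go.etcd.io/etcd/client/v3",
    "gad/gemnasium-db-master-go/go/go.mozilla.org/sops/v3",
    "gad/gemnasium-db-master-go/go/github.com/cloudflare/cfrpki/sync/lib",
]


def last_part(slug: str) -> str:
    """Return the final path segment of a slug (hypothetical helper)."""
    return slug.split("/")[-1]


names = [last_part(s) for s in slugs]
print(names)  # ['v3', 'v3', 'lib']
```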
To reproduce
Steps to reproduce the behaviour:
- create/update database with at least GAD source
Expected behaviour:
- consistent product names: etcd, sops, cfrpki (although the last one is actually Octorpki, which is only found in the vulnerability description and would require a bit too much magic to figure out without a language model)
- vendor name could be found too, but that's much less obvious, so unknown is fine I guess.
Actual behaviour:
- name = last part of slug, sometimes incorrect as in above examples
Version/platform info
Version of CVE-bin-tool: 3.4.1 (main branch, commit d5f7cf49367f90e4a71ae2461f5af7b70330bd3e, 2025-09-04)
Installed from pypi or github? github
Operating system: Linux 5.15.167.4-microsoft-standard-WSL2 (Ubuntu 24.04)
Python version: 3.12.3
Running in any particular CI environment we should know about? no
Anything else?
GAD might require case-based parsing. I don't see a lightweight one-size-fits-all solution here (as opposed to heavy, overkill LLM-based parsing).
Thanks for the report! I'm... honestly not sure what the right solution is here either. If GAD has a purl, maybe we should just skip the product slug entirely.
Opened PR #5394 to address this issue. The PR replaces the parts[-1] logic with a helper function to derive vendor/product from GAD slugs (strip /vN, ignore lib/client, etc.) and adds unit tests for the examples mentioned here.
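For readers who want a feel for the approach, here is a rough sketch of what such a helper could look like. It is not the code from the PR: the function name guess_vendor_product, the GENERIC_SEGMENTS set, and the github.com owner/repo shortcut are all assumptions made for illustration, and the generic-segment list would almost certainly need to grow for real data.

```python
import re

# Assumption: path segments too generic to be a product name.
# Only "lib" and "client" come from the discussion; extend as needed.
GENERIC_SEGMENTS = {"lib", "client"}

# Go-style major-version suffix such as "v2" or "v3".
VERSION_SEGMENT = re.compile(r"v\d+")


def guess_vendor_product(slug: str) -> tuple[str, str]:
    """Guess (vendor, product) from a GAD folder path (hypothetical helper)."""
    parts = [p for p in slug.strip("/").split("/") if p]

    # Assumption: for github.com paths, owner/repo is the best guess.
    if "github.com" in parts:
        i = parts.index("github.com")
        if len(parts) > i + 2:
            return parts[i + 1], parts[i + 2]

    # Otherwise drop trailing version and generic segments.
    while parts and (
        VERSION_SEGMENT.fullmatch(parts[-1])
        or parts[-1].lower() in GENERIC_SEGMENTS
    ):
        parts.pop()

    if not parts:
        return "unknown", "unknown"
    return "unknown", parts[-1]


for s in (
    "gad/gemnasium-db-master-go/go/go.etcd.io/etcd/client/v3",
    "gad/gemnasium-db-master-go/go/go.mozilla.org/sops/v3",
    "gad/gemnasium-db-master-go/go/github.com/cloudflare/cfrpki/sync/lib",
):
    print(guess_vendor_product(s))
```

On the three examples from this issue the sketch yields etcd, sops and cfrpki as product names, with cloudflare as vendor for the last one and unknown for the others. It still cannot recover Octorpki, since that name only appears in the advisory text.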