BUG: Package mapping to PurlDB fails despite exact match existing
Describe the bug
DejaCode v5.4.0 fails to map packages to their corresponding PurlDB entries, even though these entries exist in PurlDB and include an exact match for the same PURL. This manifests itself with the PurlDB tab being greyed-out for the package and "Improve Packages from PurlDB" not finding any data to import.
For instance, the package pkg:maven/com.fasterxml.jackson.core/[email protected]?type=jar has two related entries in PurlDB: one for pkg:maven/com.fasterxml.jackson.core/[email protected]?classifier=sources&type=jar and one for pkg:maven/com.fasterxml.jackson.core/[email protected]?type=jar. The latter should be an exact match.
These issues may be related to changes made for #307.
To Reproduce
Steps to reproduce the behavior:
- Import an SBOM (we tested this with Maven packages)
- Run the load_sbom and populate_purldb pipelines in ScanCode.io
- Manually verify in DejaCode that entries for the package exist in PurlDB
- Check that the PurlDB tab is greyed-out for the package anyway
Expected behavior
DejaCode should be able to establish a mapping between packages and PurlDB entries, especially if an exact match including qualifiers exists. If multiple entries with different qualifiers exist and could apply, that conflict needs to be resolved.
Screenshots
Context (OS, Browser, Device, etc.): n.a.
This is a bug in commit https://github.com/aboutcode-org/dejacode/commit/6fec5579180d39b3ce2c4bf3dd8b15d60d0bc780
The root cause is that the qualifier is only removed from the package's PURL but not from PurlDB's PURL. Since PurlDB apparently stores the qualifiers in the PURL as well, as shown in screenshots above, the qualifier has to be removed prior to the comparison or no matches will be found.
https://github.com/aboutcode-org/dejacode/blob/e80db0eaeffdab150a834413b1f14360a46d3c0c/component_catalog/models.py#L2566-L2571
The fix would be:
    if package_url:
        purldb_entries = [
            entry
            for entry in purldb_entries
            if get_plain_purl(entry.get("purl", "")) == get_plain_purl(package_url)
        ]
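
To illustrate why normalizing only one side finds nothing, here is a minimal standalone sketch. It does not use DejaCode's code: `get_plain_purl` below is a simplified stand-in that I assume behaves like the real helper (dropping qualifiers and subpath), the two filter functions are hypothetical names, and the PURLs are made-up examples rather than real PurlDB data.

```python
def get_plain_purl(purl: str) -> str:
    """Simplified stand-in (assumption): drop the '#subpath' and '?qualifiers' parts."""
    return purl.split("#", 1)[0].split("?", 1)[0]


def filter_one_sided(purldb_entries, package_url):
    """Behaviour described as the root cause: only the package's PURL is stripped."""
    plain = get_plain_purl(package_url)
    return [entry for entry in purldb_entries if entry.get("purl", "") == plain]


def filter_both_sides(purldb_entries, package_url):
    """Proposed fix: strip qualifiers from both PURLs before comparing."""
    plain = get_plain_purl(package_url)
    return [
        entry
        for entry in purldb_entries
        if get_plain_purl(entry.get("purl", "")) == plain
    ]


# Hypothetical example data, mirroring the shape of the reported case.
entries = [
    {"purl": "pkg:maven/org.example/demo@1.0.0?classifier=sources&type=jar"},
    {"purl": "pkg:maven/org.example/demo@1.0.0?type=jar"},
    {"purl": "pkg:maven/org.example/other@1.0.0"},
]
package_url = "pkg:maven/org.example/demo@1.0.0?type=jar"

print(filter_one_sided(entries, package_url))
print(filter_both_sides(entries, package_url))
```

With the one-sided comparison no entry survives, because every stored PURL for the package still carries its qualifiers. With both sides normalized, the jar and the sources entries both match, so choosing between them would still be a separate step.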