fix(java): use hash of GAV+root pom file path for pkgID for packages from pom.xml files

Open • DmitriyLewen opened this pull request 4 weeks ago • 1 comment

Description

This PR refactors how POM (Maven) package IDs are generated in Trivy to address issues with duplicate dependencies across multi-module Maven projects. Previously, package IDs were based solely on GAV (GroupId:ArtifactId:Version), which caused collisions when the same dependency appeared in different modules or root projects. The new approach generates hash-based IDs that incorporate both the GAV and the root POM file path, ensuring unique identification of packages across different contexts.
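The collision described above can be illustrated with a small sketch. It assumes a dependency graph stored as a map keyed by package ID; the `pkg` type and the `gav` and `indexByID` helpers are hypothetical names made up for this illustration and are not taken from Trivy's code.

```go
package main

// gav builds the GroupId:ArtifactId:Version string that served as the package ID before this PR.
func gav(groupID, artifactID, version string) string {
	return groupID + ":" + artifactID + ":" + version
}

// pkg is a hypothetical, simplified package record used only for this illustration.
type pkg struct {
	ID        string
	Module    string   // root or module pom.xml that declared the package
	DependsOn []string // IDs of direct dependencies as resolved in that module
}

// indexByID mimics a dependency graph keyed by package ID:
// a later package with the same ID silently replaces an earlier one.
func indexByID(pkgs []pkg) map[string]pkg {
	graph := make(map[string]pkg, len(pkgs))
	for _, p := range pkgs {
		graph[p.ID] = p
	}
	return graph
}
```

If two modules declare `com.example:lib:1.0` but resolve different transitive dependencies for it, both records share one key, so only the last one survives in the map and the other module's dependency information is lost.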

Changes

Core Changes

  • Modified packageID() function (pkg/dependency/parser/java/pom/parse.go:896): Now generates a hash-based ID combining GAV and POM file path instead of using GAV alone
  • Added RootFilePath field to artifact struct (pkg/dependency/parser/java/pom/artifact.go:36-38): Tracks the root or module POM file path for each artifact
  • Updated cache key generation (pkg/dependency/parser/java/pom/cache.go:20): Incorporates RootFilePath to ensure proper cache isolation between modules
  • Enhanced analysisOptions (pkg/dependency/parser/java/pom/parse.go:378-381): Added rootFilePath field to propagate file path information through the analysis pipeline
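A minimal sketch of the cache isolation idea follows. It assumes the cache key is a plain string built from the GAV plus the root file path; the actual key format in `cache.go` may differ, and the `artifact`, `cache`, `put`, and `get` names here are simplified stand-ins rather than the real types.

```go
package main

import "fmt"

// artifact is a simplified, hypothetical stand-in for the parser's artifact struct.
type artifact struct {
	GroupID      string
	ArtifactID   string
	Version      string
	RootFilePath string // root or module pom.xml this artifact was resolved from
}

// cacheKey includes RootFilePath so that the same GAV resolved from
// different root POM files maps to different cache entries.
// The key layout is an assumption made for this sketch.
func cacheKey(a artifact) string {
	return fmt.Sprintf("%s:%s:%s:%s", a.GroupID, a.ArtifactID, a.Version, a.RootFilePath)
}

// cache maps a key to the dependency names resolved for that artifact (hypothetical shape).
type cache map[string][]string

func (c cache) put(a artifact, deps []string) { c[cacheKey(a)] = deps }

func (c cache) get(a artifact) ([]string, bool) {
	deps, ok := c[cacheKey(a)]
	return deps, ok
}
```

With this layout, a result cached while analyzing one module is not returned for the same GAV in another module, at the cost of resolving shared dependencies once per root POM.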

ID Generation

  • Package IDs changed from human-readable format (e.g., com.example:log4shell:1.0-SNAPSHOT) to hash-based format (e.g., b21b31f8c0d5705a)
  • Hash is calculated using hashstructure.Hash() with the GAV and file path as inputs
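  • IDs are deterministic: the same GAV and root POM path always produce the same ID, while the same GAV under a different root POM path produces a different ID, which avoids the GAV-only collisions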
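The ID scheme above can be sketched as follows. To stay standard-library only, this sketch swaps in FNV-1a from `hash/fnv` in place of `hashstructure.Hash()`, so the IDs it produces will not match Trivy's real output. The function signature is also an assumption and does not mirror the actual `packageID()` in `parse.go`.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// packageID returns a 16-character hex ID derived from the GAV and the root POM path.
// Assumption: FNV-1a stands in for hashstructure.Hash(); both yield a 64-bit value.
func packageID(groupID, artifactID, version, rootFilePath string) string {
	h := fnv.New64a()
	for _, field := range []string{groupID, artifactID, version, rootFilePath} {
		h.Write([]byte(field))
		// A NUL separator keeps field boundaries unambiguous,
		// so ("ab", "c") and ("a", "bc") hash differently.
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%016x", h.Sum64())
}
```

Calling `packageID("com.example", "log4shell", "1.0-SNAPSHOT", "module-a/pom.xml")` twice yields the same string, while changing only the path to `module-b/pom.xml` yields a different one. The trade-off is that the ID is no longer human-readable, so the GAV has to be read from the package's name and version fields instead.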

Test Updates

  • Updated all test expectations to use hash-based IDs instead of GAV strings
  • Modified test helper functions to accept ID parameters
  • Updated golden files for integration tests (CycloneDX, JSON output)

Benefits

  1. Eliminates ID collisions: Different modules can have the same dependency without causing conflicts in the dependency graph
  2. Accurate vulnerability tracking: Vulnerabilities are now correctly associated with packages in their specific module context
  3. Improved multi-module support: Better handling of complex Maven projects with multiple modules and shared dependencies
  4. Consistent dependency trees: Dependency relationships are now correctly tracked per module rather than globally

Related issues

  • Close #7824

Related PRs

  • [ ] #7879

Checklist

  • [x] I've read the guidelines for contributing to this repository.
  • [x] I've followed the conventions in the PR title.
  • [x] I've added tests that prove my fix is effective or that my feature works.
  • [ ] I've updated the documentation with the relevant information (if needed).
  • [ ] I've added usage information (if the PR introduces new options)
  • [ ] I've included a "before" and "after" example to the description (if the PR is a user interface change).

DmitriyLewen • Dec 04 '25 12:12

@knqyf263 I created a PR based on your idea (https://github.com/aquasecurity/trivy/pull/7879#issuecomment-3574120751). Please take a look when you have time.

DmitriyLewen • Dec 04 '25 12:12