fix(java): use hash of GAV+root pom file path for pkgID for packages from pom.xml files
Description
This PR refactors how POM (Maven) package IDs are generated in Trivy to address issues with duplicate dependencies across multi-module Maven projects. Previously, package IDs were based solely on GAV (GroupId:ArtifactId:Version), which caused collisions when the same dependency appeared in different modules or root projects. The new approach generates hash-based IDs that incorporate both the GAV and the root POM file path, ensuring unique identification of packages across different contexts.
Changes
Core Changes
- Modified packageID() function (pkg/dependency/parser/java/pom/parse.go:896): Now generates a hash-based ID combining GAV and POM file path instead of using GAV alone
- Added RootFilePath field to artifact struct (pkg/dependency/parser/java/pom/artifact.go:36-38): Tracks the root or module POM file path for each artifact
- Updated cache key generation (pkg/dependency/parser/java/pom/cache.go:20): Incorporates RootFilePath to ensure proper cache isolation between modules
- Enhanced analysisOptions (pkg/dependency/parser/java/pom/parse.go:378-381): Added rootFilePath field to propagate file path information through the analysis pipeline
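To illustrate why the cache key needs the file path, here is a minimal sketch. The `artifact` struct is reduced to the fields discussed above, and `gavKey`, `scopedKey`, and `distinctKeys` are hypothetical helpers; the key format is an assumption for illustration, not the one used in cache.go.

```go
package main

import "fmt"

// artifact is a simplified stand-in for the parser's artifact struct.
// Only the fields relevant to this sketch are included.
type artifact struct {
	GroupID      string
	ArtifactID   string
	Version      string
	RootFilePath string // root or module POM file path
}

// gavKey builds a key from GAV alone (the behaviour before this PR).
func gavKey(a artifact) string {
	return fmt.Sprintf("%s:%s:%s", a.GroupID, a.ArtifactID, a.Version)
}

// scopedKey appends the POM file path so that entries from different
// modules do not share a cache slot. The exact format is hypothetical.
func scopedKey(a artifact) string {
	return fmt.Sprintf("%s:%s", gavKey(a), a.RootFilePath)
}

// distinctKeys counts how many separate cache entries a key function
// would produce for the given artifacts.
func distinctKeys(arts []artifact, key func(artifact) string) int {
	seen := make(map[string]struct{})
	for _, a := range arts {
		seen[key(a)] = struct{}{}
	}
	return len(seen)
}

func main() {
	// The same dependency resolved from two different modules.
	arts := []artifact{
		{"com.example", "log4shell", "1.0-SNAPSHOT", "module-a/pom.xml"},
		{"com.example", "log4shell", "1.0-SNAPSHOT", "module-b/pom.xml"},
	}
	fmt.Println("GAV-only entries:", distinctKeys(arts, gavKey))
	fmt.Println("path-scoped entries:", distinctKeys(arts, scopedKey))
}
```

With a GAV-only key the two modules share one cache entry, so whichever module is resolved second reuses the first module's result. Scoping the key by file path keeps them apart.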
ID Generation
- Package IDs changed from human-readable format (e.g., com.example:log4shell:1.0-SNAPSHOT) to hash-based format (e.g., b21b31f8c0d5705a)
- Hash is calculated using hashstructure.Hash() with the GAV and file path as inputs
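- Produces deterministic IDs: the same GAV and file path always yield the same ID, while the same GAV under a different root POM file path yields a different one (barring a hash collision)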
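The ID scheme above can be sketched roughly as follows. This is not the PR's implementation: to stay dependency-free it swaps in FNV-1a from the Go standard library where the PR uses hashstructure.Hash(), so the resulting values will differ from real Trivy IDs. The function signature and the separator byte are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// packageID is a hypothetical stand-in for the PR's packageID().
// It hashes the GAV together with the root POM file path and renders
// the 64-bit result as 16 hex characters.
// NOTE: FNV-1a is used here only for illustration; the PR itself
// uses hashstructure.Hash().
func packageID(groupID, artifactID, version, rootFilePath string) string {
	h := fnv.New64a()
	for _, part := range []string{groupID, artifactID, version, rootFilePath} {
		h.Write([]byte(part))
		// A NUL separator keeps ("a", "bc") and ("ab", "c") from
		// feeding identical bytes into the hash.
		h.Write([]byte{0})
	}
	return fmt.Sprintf("%016x", h.Sum64())
}

func main() {
	// Same GAV, two different module POM files.
	a := packageID("com.example", "log4shell", "1.0-SNAPSHOT", "module-a/pom.xml")
	b := packageID("com.example", "log4shell", "1.0-SNAPSHOT", "module-b/pom.xml")
	fmt.Println("module-a:", a)
	fmt.Println("module-b:", b)
	fmt.Println("distinct:", a != b)
}
```

Because the file path is part of the hashed input, the same dependency gets a separate ID in each module, while repeated runs over the same project produce the same IDs.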
Test Updates
- Updated all test expectations to use hash-based IDs instead of GAV strings
- Modified test helper functions to accept ID parameters
- Updated golden files for integration tests (CycloneDX, JSON output)
Benefits
- Eliminates ID collisions: Different modules can have the same dependency without causing conflicts in the dependency graph
- Accurate vulnerability tracking: Vulnerabilities are now correctly associated with packages in their specific module context
- Improved multi-module support: Better handling of complex Maven projects with multiple modules and shared dependencies
- Consistent dependency trees: Dependency relationships are now correctly tracked per module rather than globally
Related issues
- Close #7824
Related PRs
- [ ] #7879
Checklist
- [x] I've read the guidelines for contributing to this repository.
- [x] I've followed the conventions in the PR title.
- [x] I've added tests that prove my fix is effective or that my feature works.
- [ ] I've updated the documentation with the relevant information (if needed).
- [ ] I've added usage information (if the PR introduces new options)
- [ ] I've included a "before" and "after" example to the description (if the PR is a user interface change).
@knqyf263 I created this PR based on your idea (https://github.com/aquasecurity/trivy/pull/7879#issuecomment-3574120751). Please take a look when you have time.