Trivy Analyzer Performs Excessive Repeated DB Queries During Vulnerability Ingestion

Open arjavdongaonkar opened this issue 1 month ago • 3 comments

Current Behavior

When ingesting vulnerabilities reported by Trivy, the current analyzer processes data component by component and performs separate database operations for each vulnerability. This design results in a very high number of repeated queries and writes, especially during large scans.

To illustrate the scale:

Example Scenario

If Trivy reports ~3,000 vulnerabilities spread across ~200 components, the current implementation performs:

Per Vulnerability

  • 1× DB lookup (getVulnerabilityByVulnId)
  • 1× DB write to add the vulnerability-to-component association
  • Optional DB write if severity differs (e.g., in ~10% of cases)

Per Component

  • 1× DB lookup (getObjectByUuid(Component))
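The access pattern listed above can be modelled with a small counting sketch. This is not Dependency-Track code: `CountingStore` and its methods are hypothetical stand-ins for the persistence layer, and the component and vulnerability IDs are made up. It only shows how reads and writes grow with the number of findings rather than with the number of unique vulnerabilities.

```java
import java.util.List;
import java.util.Map;

// Illustrative model only: CountingStore is a hypothetical stand-in for the
// persistence layer, not Dependency-Track's real query manager.
final class PerItemIngestion {
    static final class CountingStore {
        int reads;
        int writes;

        void lookupComponent(String componentUuid) { reads++; }
        void lookupVulnerability(String vulnId) { reads++; }
        void linkVulnerability(String componentUuid, String vulnId) { writes++; }
    }

    /** Processes findings component by component, touching the store once per item. */
    static void ingest(Map<String, List<String>> vulnIdsByComponent, CountingStore store) {
        for (Map.Entry<String, List<String>> entry : vulnIdsByComponent.entrySet()) {
            store.lookupComponent(entry.getKey());                // 1 read per component
            for (String vulnId : entry.getValue()) {
                store.lookupVulnerability(vulnId);                // 1 read per finding, even for repeated IDs
                store.linkVulnerability(entry.getKey(), vulnId);  // 1 write per finding
            }
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        ingest(Map.of(
                "component-a", List.of("VULN-1", "VULN-2"),
                "component-b", List.of("VULN-1")), store);
        // 2 component reads + 3 vulnerability reads, 3 association writes
        System.out.println(store.reads + " reads, " + store.writes + " writes"); // prints "5 reads, 3 writes"
    }
}
```

Note that `VULN-1` is looked up twice because it affects two components. The optional severity update is left out of this model.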

Estimated Total DB Activity (Current Behavior)

Operation Type                       Count
Vulnerability existence lookups      ≈ 3,000 queries
Vulnerability-to-component writes    ≈ 3,000 writes
Severity updates (assume ~10%)       ≈ 300 writes
Component lookups                    ≈ 200 queries
Total DB interactions                ≈ 6,500+ operations
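For reference, the arithmetic behind this estimate can be reproduced with a few lines of Java. `DbOpEstimate` is a hypothetical helper written for this issue, not part of Dependency-Track, and the 10% severity-update rate is the assumption stated above.

```java
// Hypothetical helper, not part of Dependency-Track: reproduces the estimate above.
final class DbOpEstimate {
    /** Estimated DB interactions when every vulnerability is processed individually. */
    static long perVulnerabilityTotal(long vulnerabilities, long components, double severityUpdateRate) {
        long lookups = vulnerabilities;            // one existence lookup per reported vulnerability
        long associationWrites = vulnerabilities;  // one association write per reported vulnerability
        long severityWrites = Math.round(vulnerabilities * severityUpdateRate); // assumed share of updates
        long componentLookups = components;        // one lookup per component
        return lookups + associationWrites + severityWrites + componentLookups;
    }

    public static void main(String[] args) {
        // 3,000 + 3,000 + 300 + 200
        System.out.println(perVulnerabilityTotal(3_000, 200, 0.10)); // prints 6500
    }
}
```

The total grows linearly with the number of reported vulnerabilities, so doubling the findings roughly doubles the database traffic.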

Impact

  • High number of database round-trip calls
  • Significant transaction overhead (one transaction per component)
  • Poor scalability as vulnerability counts increase
  • Increased processing time for large Trivy scans
  • Unnecessary load on the persistence layer

Summary

The current ingestion mechanism does not scale well because it individually processes every vulnerability and repeatedly performs identical DB operations. This becomes especially problematic with scans containing thousands of vulnerabilities.

Steps to Reproduce

  1. Configure Dependency-Track with the Trivy analyzer enabled.
  2. Scan a project that contains a large number of dependencies (e.g., 150–300 components).
  3. Ensure Trivy reports a large vulnerability set (e.g., ~3,000 vulnerabilities).
  4. Observe the ingestion process in the logs while the Trivy findings are being processed.
  5. Monitor database query volume and ingestion time during this step.

Expected Behavior

  • Vulnerability ingestion should minimize redundant database queries.
  • Components should not be re-fetched for every vulnerability.
  • Vulnerability existence checks should not be executed thousands of times individually.
  • The system should scale consistently even when Trivy reports thousands of vulnerabilities.
  • Database load should remain stable and proportional to actual unique operations, not total vulnerabilities.
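One possible shape for this, as a minimal sketch: collect the unique vulnerability IDs first, then issue one batched lookup per entity type and one batched write. `BatchStore` and its method names are assumptions made for illustration, not Dependency-Track APIs, and this is not necessarily how the linked PR implements it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model only: BatchStore is a hypothetical stand-in for the
// persistence layer; the method names are assumptions, not Dependency-Track APIs.
final class BatchedIngestion {
    static final class BatchStore {
        int reads;
        int writes;

        void lookupComponents(Set<String> componentUuids) { reads++; }
        void lookupVulnerabilities(Set<String> vulnIds) { reads++; }
        void linkAll(Map<String, List<String>> vulnIdsByComponent) { writes++; }
    }

    /** Collects the unique vulnerability IDs across all components. */
    static Set<String> uniqueVulnIds(Map<String, List<String>> vulnIdsByComponent) {
        Set<String> unique = new HashSet<>();
        vulnIdsByComponent.values().forEach(unique::addAll);
        return unique;
    }

    /** Touches the store once per batch instead of once per finding. */
    static void ingest(Map<String, List<String>> vulnIdsByComponent, BatchStore store) {
        store.lookupComponents(vulnIdsByComponent.keySet());             // 1 read for all components
        store.lookupVulnerabilities(uniqueVulnIds(vulnIdsByComponent));  // 1 read for all unique IDs
        store.linkAll(vulnIdsByComponent);                               // 1 write for all associations
    }

    public static void main(String[] args) {
        BatchStore store = new BatchStore();
        ingest(Map.of(
                "component-a", List.of("VULN-1", "VULN-2"),
                "component-b", List.of("VULN-1")), store);
        System.out.println("reads=" + store.reads + " writes=" + store.writes); // prints "reads=2 writes=1"
    }
}
```

In a real database, very large ID sets would likely need to be split into chunks, so the number of round trips would scale with the number of chunks rather than staying constant. It would still be far below one round trip per finding.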

Dependency-Track Version

4.13.5

Dependency-Track Distribution

Container Image

Database Server

PostgreSQL

Database Server Version

No response

Browser

Google Chrome

arjavdongaonkar avatar Nov 05 '25 11:11 arjavdongaonkar

Created a PR for batch optimisations: https://github.com/DependencyTrack/dependency-track/pull/5498

arjavdongaonkar avatar Nov 05 '25 11:11 arjavdongaonkar

What do we need to do to get this merged? I guess we're looking at a lack of resources with the main developers to handle all of the issue tickets and review the PRs. How can we help improve this? (My Java-fu and knowledge of the DT code sadly aren't good enough to judge or perform a really high-level review.)

savek-cc avatar Dec 04 '25 08:12 savek-cc

@savek-cc Testing helps more than anything else tbh. Code review is one thing, but making sure stuff isn't breaking for real workloads is the main chunk of work for reviews. And unfortunately quite time consuming.

I'll have a look at the associated PR today.

nscuro avatar Dec 04 '25 11:12 nscuro