Getting started with mysql storage, batch processing is taking forever

Open kelfink opened this issue 9 months ago • 2 comments

I've set up the NVD API key in our projects and that helps a lot, but I wanted to incorporate database storage for caching. I'm using a MariaDB backend, and found that specifying "character set utf8" in initialize_mysql.sql when creating the database was needed to avoid DB errors when importing the NVD data. Now that I'm past that, the "Downloaded ..." logs completed, but now I see:

[INFO] Completed processing batch 1/125 (1%) in 992,366ms
[INFO] Completed processing batch 2/125 (2%) in 1,784,160ms
...
[INFO] Completed processing batch 14/125 (11%) in 2,254,533ms
[INFO] Completed processing batch 15/125 (12%) in 2,280,112ms
[INFO] Completed processing batch 16/125 (13%) in 2,454,196ms

Here it seems to have gotten stuck: I haven't seen batch 17/125 yet.

This seems like an incredibly long time to wait. Is it the database taking a long time? My slow query log doesn't show any queries over 1.5 seconds, so it seems not. I'm not even sure what 'Completed processing batch' means. Help?

kelfink, May 06 '24 15:05

Seeing similar behavior here against Aurora for PostgreSQL with owasp/dependency-check:10.0.3. The process has been running for over 2 hours now and still has a good way to go...

2024-08-02T17:38:48+02:00 [INFO] Downloaded 258,861/258,861 (100%)
2024-08-02T17:46:59+02:00 [INFO] Completed processing batch 1/130 (1%) in 656,293ms
2024-08-02T17:52:54+02:00 [INFO] Completed processing batch 2/130 (2%) in 1,009,181ms
2024-08-02T17:52:58+02:00 [INFO] Completed processing batch 3/130 (2%) in 1,012,495ms
2024-08-02T17:53:44+02:00 [INFO] Completed processing batch 4/130 (3%) in 1,059,200ms
(....)
2024-08-02T19:34:21+02:00 [INFO] Completed processing batch 98/130 (75%) in 6,976,488ms
2024-08-02T19:36:53+02:00 [INFO] Completed processing batch 99/130 (76%) in 7,128,100ms
2024-08-02T19:36:53+02:00 [INFO] Completed processing batch 100/130 (77%) in 7,122,805ms

UPDATE: Doubled the compute resources and it finished the last bit in just under 10 minutes:

2024-08-02T19:44:13+02:00 [INFO] Completed processing batch 129/130 (99%) in 7,443,659ms
2024-08-02T19:44:13+02:00 [INFO] Completed processing batch 130/130 (100%) in 7,437,527ms
2024-08-02T19:44:14+02:00 [INFO] Skipping Known Exploited Vulnerabilities update check since last check was within 24 hours.
2024-08-02T19:44:14+02:00 [INFO] Check for updates complete (7692806 ms)

emboss64, Aug 02 '24 17:08
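
A note on reading these logs: in the timestamped excerpt above, the gap between consecutive log timestamps roughly matches the difference between consecutive "in ...ms" figures. That suggests each figure is time elapsed since the update started rather than the time spent on that one batch. This is an inference from the numbers posted here, not something the thread confirms. The following is a minimal Python sketch for checking it on your own log; the names `LINE`, `parse`, `compare` and `SAMPLE` are hypothetical helpers for illustration, and the regular expression assumes the exact line layout quoted above.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2024-08-02T17:46:59+02:00 [INFO] Completed processing batch 1/130 (1%) in 656,293ms
LINE = re.compile(
    r"(?P<ts>\S+) \[INFO\] Completed processing batch "
    r"(?P<n>\d+)/(?P<total>\d+) \(\d+%\) in (?P<ms>[\d,]+)ms"
)


def parse(log_text):
    """Return (timestamp, batch number, reported ms) for each matching line."""
    rows = []
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:
            rows.append((
                datetime.fromisoformat(m["ts"]),
                int(m["n"]),
                int(m["ms"].replace(",", "")),
            ))
    return rows


def compare(rows):
    """For each consecutive pair of log lines, return
    (batch number, wall-clock gap in s, gap between reported figures in s)."""
    out = []
    for (t0, _, ms0), (t1, n1, ms1) in zip(rows, rows[1:]):
        wall_s = (t1 - t0).total_seconds()
        reported_s = (ms1 - ms0) / 1000
        out.append((n1, wall_s, reported_s))
    return out


# The first four timestamped lines from the comment above.
SAMPLE = """\
2024-08-02T17:46:59+02:00 [INFO] Completed processing batch 1/130 (1%) in 656,293ms
2024-08-02T17:52:54+02:00 [INFO] Completed processing batch 2/130 (2%) in 1,009,181ms
2024-08-02T17:52:58+02:00 [INFO] Completed processing batch 3/130 (2%) in 1,012,495ms
2024-08-02T17:53:44+02:00 [INFO] Completed processing batch 4/130 (3%) in 1,059,200ms
"""

for n, wall_s, reported_s in compare(parse(SAMPLE)):
    print(f"batch {n}: wall-clock +{wall_s:.0f}s, reported +{reported_s:.1f}s")
```

On these four lines the two columns agree to within a few seconds (for example, about 355 s of wall-clock time against about 352.9 s between the figures for batches 1 and 2). Later in the same log the figure for batch 100 is slightly lower than for batch 99, so the values are not strictly increasing and the cumulative reading should be treated as approximate.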

Pretty sure that part is a bit memory-intensive. The benefit of using a DB like this is that once you have the first full update of data, it is very quick to do daily updates and keep it current.

jeremylong, Aug 03 '24 14:08