Support mirroring through API
Users may want to mirror OSV data in batches to improve performance on both sides. If the OSV API could provide batch data similar to what the GHSA API offers, it would help with mirroring.
Related links:
https://github.com/DependencyTrack/dependency-track/blob/master/src/main/java/org/dependencytrack/tasks/GitHubAdvisoryMirrorTask.java
https://github.com/DependencyTrack/dependency-track/blob/master/src/main/resources/templates/github/securityAdvisories.peb
Hey @VinodAnandan!
Could you please clarify what you mean by mirroring batch data? Do you mean accessing a data dump of all aggregated OSV data?
There is already a way to do so here: https://github.com/google/osv#data-dumps
Hey @oliverchang, the initial batch would contain all the data as of that point in time. Subsequent runs would fetch only new or modified data.
Can we periodically fetch incremental data from the OSV data dumps? Could you please share any documentation?
We don't have any functionality today to provide incremental data, mostly because we haven't seen a pressing need for it. The total size of all vulnerability data should be small enough that it's simpler to just bulk process all entries from scratch each time. This simplicity may significantly outweigh any potential efficiency gains from a more complicated incremental setup.
Here are the current sizes across all of OSV:
> gsutil ls -lah 'gs://osv-vulnerabilities/**/all.zip'
257.89 KiB 2022-06-07T06:57:05Z gs://osv-vulnerabilities/Android/all.zip#1654585025273555 metageneration=1
11.08 KiB 2022-06-07T06:57:07Z gs://osv-vulnerabilities/DWF/all.zip#1654585027026057 metageneration=1
19.62 KiB 2022-06-07T06:57:07Z gs://osv-vulnerabilities/GSD/all.zip#1654585027622910 metageneration=1
701.51 KiB 2022-06-07T06:57:12Z gs://osv-vulnerabilities/Go/all.zip#1654585032781187 metageneration=1
11.36 KiB 2022-06-07T06:57:15Z gs://osv-vulnerabilities/Hex/all.zip#1654585035336832 metageneration=1
783 B 2022-06-07T06:57:15Z gs://osv-vulnerabilities/JavaScript/all.zip#1654585035648760 metageneration=1
9.18 MiB 2022-06-07T06:58:39Z gs://osv-vulnerabilities/Linux/all.zip#1654585119381016 metageneration=1
1.71 MiB 2022-06-07T06:59:05Z gs://osv-vulnerabilities/Maven/all.zip#1654585145838779 metageneration=1
212.78 KiB 2022-06-07T06:59:11Z gs://osv-vulnerabilities/NuGet/all.zip#1654585151540424 metageneration=1
1.62 MiB 2022-06-07T06:59:25Z gs://osv-vulnerabilities/OSS-Fuzz/all.zip#1654585165363352 metageneration=1
911.27 KiB 2022-06-07T06:59:39Z gs://osv-vulnerabilities/Packagist/all.zip#1654585179108957 metageneration=1
3.58 MiB 2022-06-07T07:00:05Z gs://osv-vulnerabilities/PyPI/all.zip#1654585205754672 metageneration=1
560.23 KiB 2022-06-07T07:00:19Z gs://osv-vulnerabilities/RubyGems/all.zip#1654585219361574 metageneration=1
22 B 2022-06-07T07:00:21Z gs://osv-vulnerabilities/UVI/all.zip#1654585220970207 metageneration=1
721.16 KiB 2022-06-07T07:00:25Z gs://osv-vulnerabilities/crates.io/all.zip#1654585225815500 metageneration=1
2.28 MiB 2022-06-07T07:00:41Z gs://osv-vulnerabilities/npm/all.zip#1654585241801811 metageneration=1
Will this cause issues?
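For illustration, bulk processing one ecosystem's dump from scratch could look roughly like the sketch below. It assumes the `gs://osv-vulnerabilities/<ecosystem>/all.zip` objects listed above are also readable over HTTPS at the URL shown (an assumption, adjust if the bucket is exposed differently) and that each archive member is a single JSON record with an `id` field. The names `fetch_dump` and `parse_dump` are made up for this example.

```python
import io
import json
import urllib.request
import zipfile

# Assumption: public HTTPS form of gs://osv-vulnerabilities/<ecosystem>/all.zip
DUMP_URL = "https://osv-vulnerabilities.storage.googleapis.com/{ecosystem}/all.zip"


def fetch_dump(ecosystem):
    """Download the full all.zip for one ecosystem and return its raw bytes."""
    with urllib.request.urlopen(DUMP_URL.format(ecosystem=ecosystem)) as resp:
        return resp.read()


def parse_dump(zip_bytes):
    """Return every JSON record in the archive, keyed by its "id" field."""
    records = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue  # skip anything that is not a vulnerability record
            record = json.loads(archive.read(name))
            records[record["id"]] = record
    return records
```

A mirror job would then call something like `parse_dump(fetch_dump("PyPI"))` on a schedule and replace its local copy wholesale.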
Thanks @oliverchang, we will use the full download as an interim solution. But I think incremental updates would enable smaller and more frequent downloads of the database.
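As a stopgap on the consumer side, something like the following could limit downstream processing to entries that changed since the last sync. This is only a sketch: it assumes the records have already been loaded from a full dump as dicts, that each carries the OSV schema's `modified` timestamp as an RFC 3339 UTC string ending in `Z`, and that `last_sync` is a timezone-aware datetime. The helper name `changed_since` is hypothetical. It does not reduce download size, since the full dump still has to be fetched first.

```python
from datetime import datetime, timezone


def parse_rfc3339(value):
    """Parse an RFC 3339 UTC timestamp such as "2022-06-07T06:57:05Z"."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changed_since(records, last_sync):
    """Return the records whose "modified" time is later than last_sync."""
    return [r for r in records if parse_rfc3339(r["modified"]) > last_sync]


if __name__ == "__main__":
    sample = [
        {"id": "EXAMPLE-1", "modified": "2022-06-01T00:00:00Z"},
        {"id": "EXAMPLE-2", "modified": "2022-06-07T06:57:05Z"},
    ]
    cutoff = datetime(2022, 6, 5, tzinfo=timezone.utc)
    for record in changed_since(sample, cutoff):
        print(record["id"])
```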