
[Feature] Resume download from flaky servers

Open Rafiot opened this issue 3 months ago • 7 comments

This is still related to vulnerability-lookup, and the problem is as follows:

  • CSAF repos are only moderately reliable: some files are missing, the server intermittently fails to serve some files, connections are flaky, ...
  • The feeders in vulnerability-lookup have two modes: initial import (fetch everything, without a --time_range) and update (with a --time_range)
  • In order to go from initial import to update mode, I need to make sure I have downloaded everything I could up to a specific point in time

Right now, when the downloader finishes, I check the logs and if total_failed is not 0, I assume the fetching was incomplete, and my next run will still be an initial import.

Some servers have files that always fail, so I add them to --ignore_pattern until I have a clean initial import.

The problem is that the downloader tries to download every single file on every run, so unless I get very lucky and every file downloads cleanly in a single execution, I never leave initial import mode.

Do you have any idea how to solve that problem? I was thinking of maybe adding a skip-existing-files option that computes the hash of the local file, compares it against the remote hash, and skips the download if they're the same (comparing hashes rather than mere existence, so that if the file was updated between two runs of the downloader, we still re-fetch it).
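
To illustrate the idea, here is a minimal Go sketch of that comparison (csaf_distribution is written in Go). It assumes the provider publishes the standard `.sha256` sidecar file next to each advisory; the function names and paths are hypothetical and not part of the downloader's actual API:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// localSHA256 computes the SHA-256 hex digest of a file on disk.
func localSHA256(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// remoteSHA256 fetches the ".sha256" sidecar that CSAF providers publish
// next to each advisory and returns the hex digest it contains.
func remoteSHA256(advisoryURL string) (string, error) {
	resp, err := http.Get(advisoryURL + ".sha256")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	// The sidecar is usually "<hexdigest>  <filename>\n"; keep only the digest.
	fields := strings.Fields(string(body))
	if len(fields) == 0 {
		return "", fmt.Errorf("empty checksum file")
	}
	return fields[0], nil
}

// shouldSkip reports whether localPath already matches the remote advisory,
// so a resumed run can avoid re-downloading it.
func shouldSkip(localPath, advisoryURL string) bool {
	local, err := localSHA256(localPath)
	if err != nil {
		return false // missing or unreadable local file: download it
	}
	remote, err := remoteSHA256(advisoryURL)
	if err != nil {
		return false // cannot verify remotely: be safe and re-download
	}
	return strings.EqualFold(local, remote)
}

func main() {
	// Hypothetical example values; a real integration would iterate over
	// the downloader's worklist instead.
	url := "https://example.com/.well-known/csaf/white/2025/example.json"
	if shouldSkip("advisories/example.json", url) {
		fmt.Println("up to date, skipping")
	} else {
		fmt.Println("needs (re-)download")
	}
}
```

Treating any verification error as "re-download" keeps the option safe: at worst it degrades to today's behavior of fetching everything again.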

Rafiot · Aug 14 '25 12:08