
NVD Importer failing due to NVD Data Feed API changes

Open jarek-o opened this issue 2 months ago • 4 comments

Vulnerable Code version: 36.1.3, v36.1.1 and possibly all others
System: OpenShift

Hi team,

I recently noticed that our NVD importer no longer works. It fails with this error:

Importing data using nvd_importer
INFO 2025-10-28 13:08:24.045287 UTC Pipeline [NVDImporterPipeline] starting
INFO 2025-10-28 13:08:24.045647 UTC Step [collect_and_store_advisories] starting
INFO 2025-10-28 13:08:25.131858 UTC Collecting 315,971 advisories
INFO 2025-10-28 13:08:25.132113 UTC Fetching `https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2002.json.gz`
INFO 2025-10-28 13:08:26.250632 UTC Pipeline failed
INFO 2025-10-28 13:08:26.250869 UTC Running [on_failure] tasks
INFO 2025-10-28 13:08:26.250972 UTC Completed [on_failure] tasks in 0 seconds
Not a gzipped file (b'<!')

Traceback:
  File "/app/vulnerabilities/pipelines/__init__.py", line 106, in execute
    step(self)
  File "/app/vulnerabilities/pipelines/__init__.py", line 209, in collect_and_store_advisories
    for advisory in progress.iter(self.collect_advisories()):
  File "/usr/local/lib/python3.9/site-packages/aboutcode/pipeline/__init__.py", line 314, in iter
    for item in iterator:
  File "/app/vulnerabilities/pipelines/nvd_importer.py", line 97, in collect_advisories
    for _year, cve_data in fetch_cve_data_1_1(logger=self.log):
  File "/app/vulnerabilities/pipelines/nvd_importer.py", line 119, in fetch_cve_data_1_1
    yield year, fetch(url=download_url, logger=logger)
  File "/app/vulnerabilities/pipelines/nvd_importer.py", line 106, in fetch
    data = gzip.decompress(gz_file.content)
  File "/usr/local/lib/python3.9/gzip.py", line 556, in decompress
    return f.read()
  File "/usr/local/lib/python3.9/gzip.py", line 300, in read
    return self._buffer.read(size)
  File "/usr/local/lib/python3.9/gzip.py", line 487, in read
    if not self._read_gzip_header():
  File "/usr/local/lib/python3.9/gzip.py", line 435, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
CommandError: 1 failed!: nvd_importer

I tried fetching this file with curl, both from my local machine and from the server, and all it contains is this:

<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]>    <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]>    <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en-US"> <!--<![endif]-->
<head>
<title>Attention Required! | Cloudflare</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<link rel="stylesheet" id="cf_styles-css" href="/cdn-cgi/styles/cf.errors.css" />
<!--[if lt IE 9]><link rel="stylesheet" id='cf_styles-ie-css' href="/cdn-cgi/styles/cf.errors.ie.css" /><![endif]-->
<style>body{margin:0;padding:0}</style>

This is a Cloudflare block page, which is why the error begins with "Not a gzipped file (b'<!')": the response body starts with "<!" rather than a gzip header.
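To make this failure mode easier to spot, here is a minimal sketch of a check that could run before `gzip.decompress`. It is not the importer's actual code: `decompress_feed` and `GZIP_MAGIC` are hypothetical names chosen for illustration, and the only facts relied on are that gzip streams start with the bytes `1f 8b` and that the blocked response starts with `<!`.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def decompress_feed(content: bytes, url: str = "") -> bytes:
    """Decompress a downloaded feed, or fail with a readable message.

    Hypothetical helper: raises ValueError with a preview of the body
    when the payload is not gzip (for example an HTML block page).
    """
    if not content.startswith(GZIP_MAGIC):
        # A short preview makes an HTML error page recognisable in the logs.
        preview = content[:60].decode("utf-8", errors="replace")
        raise ValueError(f"Expected gzip data from {url!r}, got: {preview!r}")
    return gzip.decompress(content)


if __name__ == "__main__":
    # A real gzip payload round-trips.
    print(decompress_feed(gzip.compress(b'{"ok": true}')))

    # An HTML page is rejected with a message that shows what came back.
    try:
        decompress_feed(b"<!DOCTYPE html><title>Attention Required!</title>")
    except ValueError as exc:
        print(exc)
```

With a check like this the log would show the start of the HTML page instead of a bare `BadGzipFile`, which points more directly at a blocked or removed URL.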

NIST (NVD) is deprecating its download feeds and heavily rate-limits how often files can be downloaded from NVD.

gist:

Rate Limits
NIST firewall rules put in place to prevent denial of service attacks can thwart your application if it exceeds a predetermined rate limit.
The public rate limit (without an API key) is 5 requests in a rolling 30 second window; the rate limit with an API key is 50 requests in a rolling 30 second window. 
Requesting an API key significantly raises the number of requests that can be made in a given time frame. 
However, it is still recommended that your application sleeps for several seconds between requests so that legitimate requests are not denied, and all requests are responded to in sequence.
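The rolling-window limit quoted above can be respected on the client side with a small limiter. This is only a sketch under assumptions: `RollingWindowLimiter` is a hypothetical class, not part of VulnerableCode or of any NVD client, and the defaults simply mirror the quoted public limit of 5 requests per 30 seconds. The `clock` and `sleep` parameters exist so the behaviour can be tested without real waiting.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Allow at most max_requests calls in any rolling window (hypothetical helper)."""

    def __init__(self, max_requests=5, window_seconds=30.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of recent requests, oldest first

    def _drop_expired(self, now):
        # Forget requests that have left the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def wait(self):
        """Block until one more request fits in the window, then record it."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._stamps) >= self.max_requests:
            # Sleep until the oldest recorded request leaves the window.
            self._sleep(self.window_seconds - (now - self._stamps[0]))
            now = self._clock()
            self._drop_expired(now)
        self._stamps.append(now)


if __name__ == "__main__":
    # Tiny demo with a short window so it finishes quickly.
    limiter = RollingWindowLimiter(max_requests=2, window_seconds=1.0)
    for i in range(4):
        limiter.wait()
        print("request", i)
```

Calling `limiter.wait()` before each HTTP request keeps the client under the limit. With an API key, the same class could be constructed with `max_requests=50`, matching the quoted figure.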

Are there any plans to move the NVD importer to the API? It is the recommended way to obtain vulnerabilities and allows a higher rate of API calls (after obtaining an API key).

For now this functionality does not work, and it might eventually stop working altogether.

jarek-o avatar Oct 29 '25 13:10 jarek-o

@jarek-o It sounds like NVD has dropped version 1.1 of the Data Feed API and is now using version 2.0.

Decommissioning of Legacy Data Feed Files

As of August 20th, 2025, the following legacy Data Feed files have been removed from the NVD [Data Feeds Page](https://nvd.nist.gov/vuln/data-feeds) and are no longer available for access or download

https://www.nist.gov/itl/nvd

https://nvd.nist.gov/feeds/json/cve/1.1/nvdcve-1.1-2002.json.gz (returns a blocked page)
https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-2002.json.gz (this URL works)

I’ll create a quick PR for this after testing.
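As a rough illustration of the URL change described above, the per-year 2.0 feed URLs could be generated like this. It is a sketch, not the eventual PR: `FEED_URL_2_0` and `feed_urls` are hypothetical names, and it assumes the URL pattern confirmed for 2002 holds for every later year. The JSON layout inside the 2.0 files is not covered here and would also need checking against the existing parser.

```python
from datetime import date

# Pattern taken from the working 2002 URL; assumed to hold for other years.
FEED_URL_2_0 = "https://nvd.nist.gov/feeds/json/cve/2.0/nvdcve-2.0-{year}.json.gz"


def feed_urls(start_year=2002, end_year=None):
    """Yield (year, url) pairs for the yearly 2.0 feeds (hypothetical helper)."""
    if end_year is None:
        end_year = date.today().year
    for year in range(start_year, end_year + 1):
        yield year, FEED_URL_2_0.format(year=year)


if __name__ == "__main__":
    for year, url in feed_urls(2002, 2004):
        print(year, url)
```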

ziadhany avatar Oct 29 '25 14:10 ziadhany

can you assign this task to me?

jenamjain avatar Oct 29 '25 18:10 jenamjain

> can you assign this task to me?

@jenamjain Sorry, but I'm already working on it. Feel free to pick any other issue that doesn't have a PR.

ziadhany avatar Oct 29 '25 19:10 ziadhany

do we have a rough ETA on when this will be implemented?

jarek-o avatar Oct 30 '25 12:10 jarek-o