cve-bin-tool
fix: Timeout Error with CVE-Bin-Tool Database Update/Download
Description
I'm experiencing a timeout error when updating/downloading the database using cve-bin-tool --nvd api2, even with the nvd_key option and --nvd json-mirror. This issue persists across different networks and has prevented me from using the tool effectively.
I also tried the --disable-data-source EPSS option, but I still get the same timeout error.
To reproduce
Steps to reproduce the behaviour:
- Run this command: cve-bin-tool -l debug --nvd json-mirror -u now --disable-data-source EPSS
Expected behaviour: No timeout error.
Actual behaviour:
x: ~>cve-bin-tool --nvd json-mirror -u now --disable-data-source EPSS
[06:57:24] INFO cve_bin_tool - CVE Binary Tool v3.3.1dev0 cli.py:571
INFO cve_bin_tool - This product uses the NVD API but is not endorsed or certified by the NVD. cli.py:572
INFO cve_bin_tool - Disabling data source EPSS cli.py:693
[06:57:55] WARNING cve_bin_tool.CVEDB - Updating cachedir /home/tvyavaha/.cache/cve-bin-tool cvedb.py:665
[06:58:03] INFO cve_bin_tool - You are running version 3.3.1dev0 of cve-bin-tool but the latest PyPI Version is 3.3. version.py:27
INFO cve_bin_tool - Getting NVD CVE data... nvd_source.py:327
INFO cve_bin_tool - Getting GitLab Advisory Database CVEs... gad_source.py:86
INFO cve_bin_tool.CVEDB - Rolling back the cache to its previous state cvedb.py:809
INFO cve_bin_tool - Getting RedHat CVEs... redhat_source.py:69
Downloading CVEs... ________________________________________ 100% 0:00:00
[07:01:26] INFO cve_bin_tool - Getting Open Source Vulnerability Database CVEs... osv_source.py:161
[07:01:51] ERROR cve_bin_tool - Unable to fetch OSV CVEs, skipping OSV. osv_source.py:376
________________________________ Traceback (most recent call last) _________________________________
_ /usr/local/bin/cve-bin-tool:8 in <module> _
_ _
_ 5 from cve_bin_tool.cli import main _
_ 6 if __name__ == '__main__': _
_ 7 _ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) _
_ _ 8 _ sys.exit(main()) _
_ 9 _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/cli.py:808 in main _
_ _
_ 805 _ _
_ 806 _ # update db if needed _
_ 807 _ if db_update != "never": _
_ _ 808 _ _ cvedb_orig.get_cvelist_if_stale() _
_ 809 _ else: _
_ 810 _ _ LOGGER.warning("Not verifying CVE DB cache") _
_ 811 _ _ if not cvedb_orig.check_cve_entries(): _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/cvedb.py:281 in get_cvelist_if_stale _
_ _
_ 278 _ _ _ datetime.datetime.today() _
_ 279 _ _ _ - datetime.datetime.fromtimestamp(self.dbpath.stat().st_mtime) _
_ 280 _ _ ) > datetime.timedelta(hours=24): _
_ _ 281 _ _ _ self.refresh_cache_and_update_db() _
_ 282 _ _ _ self.time_of_last_update = datetime.datetime.today() _
_ 283 _ _ else: _
_ 284 _ _ _ _ = self.get_db_update_date() _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/cvedb.py:268 in refresh_cache_and_update_db _
_ _
_ 265 _ _ _
_ 266 _ _ # if the database isn't open, open it _
_ 267 _ _ self.init_database() _
_ _ 268 _ _ self.populate_db() _
_ 269 _ _ self.LOGGER.debug("Updating exploits data.") _
_ 270 _ _ self.create_exploit_db() _
_ 271 _ _ self.update_exploits() _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/cvedb.py:483 in populate_db _
_ _
_ 480 _ _ _
_ 481 _ _ # EPSS uses metrics table to get the EPSS metric id. _
_ 482 _ _ # It can't be run before creation of metrics table. _
_ _ 483 _ _ self.populate_epss() _
_ 484 _ _ self.store_epss_data() _
_ 485 _ _ _
_ 486 _ _ for idx, data in enumerate(self.data): _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/cvedb.py:635 in populate_epss _
_ 632 _ _ Add EPSS data into the database""" _
_ 633 _ _ epss = epss_source.Epss_Source() _
_ 634 _ _ cursor = self.db_open_and_get_cursor() _
_ _ 635 _ _ self.epss_data = run_coroutine(epss.update_epss(cursor)) _
_ 636 _ _ self.db_close() _
_ 637 _ _
_ 638 _ def metric_finder(self, cursor, cve): _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/async_utils.py:90 in run_coroutine _
_ _
_ 87 _ """ _
_ 88 _ loop = get_event_loop() _
_ 89 _ aws = asyncio.ensure_future(coro, loop=loop) _
_ _ 90 _ result = loop.run_until_complete(aws) _
_ 91 _ return result _
_ 92 _
_ 93 _
_ _
_ /usr/lib/python3.11/asyncio/base_events.py:653 in run_until_complete _
_ _
_ 650 _ _ if not future.done(): _
_ 651 _ _ _ raise RuntimeError('Event loop stopped before Future completed.') _
_ 652 _ _ _
_ _ 653 _ _ return future.result() _
_ 654 _ _
_ 655 _ def stop(self): _
_ 656 _ _ """Stop running the event loop. _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/data_sources/epss_source.py:47 in _
_ update_epss _
_ _
_ 44 _ _ _ _ - EPSS percentile _
_ 45 _ _ """ _
_ 46 _ _ self.EPSS_id_finder(cursor) _
_ _ 47 _ _ await self.download_and_parse_epss() _
_ 48 _ _ return self.epss_data _
_ 49 _ _
_ 50 _ async def download_and_parse_epss(self): _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/data_sources/epss_source.py:52 in _
_ download_and_parse_epss _
_ _
_ 49 _ _
_ 50 _ async def download_and_parse_epss(self): _
_ 51 _ _ """Downloads and parses the EPSS data from the CSV file.""" _
_ _ 52 _ _ await self.download_epss_data() _
_ 53 _ _ self.epss_data = self.parse_epss_data() _
_ 54 _ _
_ 55 _ async def download_epss_data(self): _
_ _
_ /usr/local/lib/python3.11/dist-packages/cve_bin_tool/data_sources/epss_source.py:93 in _
_ download_epss_data _
_ _
_ 90 _ _ else: _
_ 91 _ _ _ try: _
_ 92 _ _ _ _ async with aiohttp.ClientSession(headers=HTTP_HEADERS) as session: _
_ _ 93 _ _ _ _ _ async with session.get(self.DATA_SOURCE_LINK) as response: _
_ 94 _ _ _ _ _ _ response.raise_for_status() _
_ 95 _ _ _ _ _ _ self.LOGGER.info("Getting EPSS data...") _
_ 96 _ _ _ _ _ _ decompressed_data = gzip.decompress(await response.read()) _
_ _
_ /usr/local/lib/python3.11/dist-packages/aiohttp/client.py:1197 in __aenter__ _
_ _
_ 1194 _ _ return self.__await__() _
_ 1195 _ _
_ 1196 _ async def __aenter__(self) -> _RetType: _
_ _ 1197 _ _ self._resp = await self._coro _
_ 1198 _ _ return self._resp _
_ 1199 _
_ 1200 _
_ _
_ /usr/local/lib/python3.11/dist-packages/aiohttp/client.py:507 in _request _
_ _
_ 504 _ _ _
_ 505 _ _ timer = tm.timer() _
_ 506 _ _ try: _
_ _ 507 _ _ _ with timer: _
_ 508 _ _ _ _ while True: _
_ 509 _ _ _ _ _ url, auth_from_url = strip_auth_from_url(url) _
_ 510 _ _ _ _ _ if auth and auth_from_url: _
_ _
_ /usr/local/lib/python3.11/dist-packages/aiohttp/helpers.py:735 in __exit__ _
_ _
_ 732 _ _ _ self._tasks.pop() _
_ 733 _ _ _
_ 734 _ _ if exc_type is asyncio.CancelledError and self._cancelled: _
_ _ 735 _ _ _ raise asyncio.TimeoutError from None _
_ 736 _ _ return None _
_ 737 _ _
_ 738 _ def timeout(self) -> None: _
____________________________________________________________________________________________________
TimeoutError
x: ~>
Version/platform info
Version of CVE-bin-tool (e.g. output of cve-bin-tool --version): 3.3.1dev0
Installed from pypi or github? pypi
Operating system: Linux/Windows (other platforms are unsupported but feel free to report issues anyhow)
- On Linux (or Windows Subsystem for Linux) you can run uname -a
x: ~>uname -a
Linux x 6.9.0-rc1+ SMP PREEMPT_DYNAMIC Fri Mar 29 08:33:41 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Python version (e.g. python3 --version): Python 3.11.6
Running in any particular CI environment we should know about? (e.g. Github Actions) No.
Darn, I was hoping that workaround would work.
Fixing this should just involve fixing it so we don't get a stack trace if there's a timeout error, but we might want to also double-check that there's nothing weird with the link we're using for epss data.
I've tried a couple of things but I can't seem to duplicate this myself easily even with an empty database. As tempting as it is to start messing with my routing table to see if I can make it happen, I think it probably makes more sense to just catch the timeout and hope we can work around it with me going in a bit blind? So I'm going to try to do that instead of bashing my head against it trying to figure out why it's happening.
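As a rough illustration of "just catch the timeout", here is a minimal sketch using only the standard library. It is not the actual cve-bin-tool fix: `fetch_or_skip`, `slow_download`, `fast_download` and the logger name are hypothetical stand-ins, and the log message only imitates the "skipping OSV" message seen in the log above.

```python
import asyncio
import logging

# Hypothetical logger name, used only for this sketch.
LOGGER = logging.getLogger("epss_sketch")


async def fetch_or_skip(download, timeout_seconds):
    """Await download(); return its result, or None if it times out."""
    try:
        return await asyncio.wait_for(download(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Log one line and carry on instead of surfacing a stack trace.
        LOGGER.error("Timed out fetching EPSS data, skipping EPSS.")
        return None


async def slow_download():
    # Stand-in for a network request that stalls.
    await asyncio.sleep(10)
    return b"data"


async def fast_download():
    # Stand-in for a network request that completes promptly.
    return b"data"
```

The caller can then treat `None` as "no fresh data" and continue with whatever is already cached.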
So it turns out the reason --disable-data-source EPSS didn't work is that EPSS wasn't loading in the same way as the other data sources. Le sigh. Fixing it so it could be disabled wasn't too hard, but fixing it so it behaves exactly like the other data sources (including better handling of the timeouts) will take a bit more refactoring. I'm testing a halfway between solution right now to see if it's worth having a halfway solution or if I should just refactor properly.
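To make the idea concrete, here is a hypothetical sketch of what "loading like the other data sources" could mean: every source, EPSS included, goes through the same loop, so a disabled source is never invoked. The function `update_sources` and the dictionary of fetchers are invented for illustration and do not reflect the real cve-bin-tool code.

```python
def update_sources(fetchers, disabled):
    """Run each fetcher unless its name is disabled; return results by name.

    `fetchers` maps a source name (e.g. "EPSS") to a zero-argument callable.
    """
    blocked = {name.upper() for name in disabled}
    results = {}
    for name, fetch in fetchers.items():
        if name.upper() in blocked:
            # A disabled source must never be called, so it cannot time out.
            continue
        results[name] = fetch()
    return results
```

With this shape, no source needs a special-case call outside the loop, which is where the traceback above shows EPSS being invoked.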
I've got a tentative fix in https://github.com/intel/cve-bin-tool/pull/4125 that may work for you. It should at least allow the EPSS data source to be disabled correctly as a workaround, as well as allow it to fail more gracefully. I haven't run it against the full test suite yet so it may need a bit of work still.
@tvyavaha, did you by any chance run your original attempts behind a proxy? You mentioned switching between networks, but just in case - was there one where HTTPS connections are not/need not be proxied?
I'm facing a similar timeout, and for me it's definitely the proxy, and the reason seems to be that EPSS data source code doesn't seem to support proxies (lines 72 and 92 need to be adjusted in the same way #923 did for cvedb.py).
#4125 is focusing on making it possible to properly disable this data source, but that is a separate story and I wonder if the actual root cause is the same for the issue I observe, and the one you did. If so, I think there's no point in submitting yet another issue on this, and otherwise I'd do that.
@alext-w I reran the command without using a proxy, but it still fails at different points due to 'network unreachable' errors.
``
x: ~>cve-bin-tool -l debug --nvd json-mirror -u now --disable-data-source EPSS
[10:27:30] INFO cve_bin_tool - CVE Binary Tool v3.3.1dev0 cli.py:571
INFO cve_bin_tool - This product uses the NVD API but is not endorsed or certified by the NVD. cli.py:572
DEBUG cve_bin_tool - Processing disabled data sources ['EPSS'] cli.py:686
INFO cve_bin_tool - Disabling data source EPSS cli.py:693
DEBUG cve_bin_tool - Accepted disabled data sources ['EPSS'] cli.py:695
DEBUG cve_bin_tool.CVEDB - Creating backup of cachedir /home/tvyavaha/.cache/cve-bin-tool at /home/tvyavaha/.cache/cve-bin-tool-backup cvedb.py:786
[10:27:57] WARNING cve_bin_tool.CVEDB - Updating cachedir /home/tvyavaha/.cache/cve-bin-tool cvedb.py:665
[10:28:03] DEBUG cve_bin_tool.CVEDB - Updating CVE data. This will take a few minutes. cvedb.py:262
[10:37:03] ERROR cve_bin_tool - An error occurred while fetching https://pypi.org/pypi/cve-bin-tool/json: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /pypi/cve-bin-tool/json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f150ea552d0>: Failed to establish a new connection: [Errno 101] Network is unreachable')) util.py:284
WARNING cve_bin_tool - version.py:37
-------------------------- Can't check for the latest version ---------------------------
warning: unable to access 'https://pypi.org/pypi/cve-bin-tool'
Exception details: 'NoneType' object is not subscriptable
Please make sure you have a working internet connection or try again later.
INFO cve_bin_tool - Getting NVD CVE data... nvd_source.py:327
INFO cve_bin_tool - Getting GitLab Advisory Database CVEs... gad_source.py:86
INFO cve_bin_tool.CVEDB - Rolling back the cache to its previous state cvedb.py:809
INFO cve_bin_tool - Getting RedHat CVEs... redhat_source.py:69
DEBUG cve_bin_tool - RedHat - Get request https://access.redhat.com/hydra/rest/securitydata/cve.json?after=2024-06-04 redhat_source.py:46
[10:39:18] DEBUG cve_bin_tool - Error while fetching GitLab Advisory Database CVEs : Cannot connect to host gitlab.com:443 ssl:default [Network is unreachable] gad_source.py:345
ERROR cve_bin_tool - Unable to fetch GitLab Advisory Database CVEs, skipping GAD. gad_source.py:346
________________________________ Traceback (most recent call last) _________________________________
_ /usr/local/lib/python3.11/dist-packages/aiohttp/connector.py:1025 in _wrap_create_connection _
_ _
_ 1022 _ _ _ async with ceil_timeout( _
_ 1023 _ _ _ _ timeout.sock_connect, ceil_threshold=timeout.ceil_threshold _
_ 1024 _ _ _ ): _
_ _ 1025 _ _ _ _ return await self._loop.create_connection(*args, **kwargs) _
_ 1026 _ _ except cert_errors as exc: _
_ 1027 _ _ _ raise ClientConnectorCertificateError(req.connection_key, exc) from exc _
_ 1028 _ _ except ssl_errors as exc: _
_ _
_ /usr/lib/python3.11/asyncio/base_events.py:1085 in create_connection _
_ _
_ 1082 _ _ _ _ exceptions = [exc for sub in exceptions for exc in sub] _
_ 1083 _ _ _ _ try: _
_ 1084 _ _ _ _ _ if len(exceptions) == 1: _
_ _ 1085 _ _ _ _ _ _ raise exceptions[0] _
_ 1086 _ _ _ _ _ else: _
_ 1087 _ _ _ _ _ _ # If they all have the same str(), raise one. _
_ 1088 _ _ _ _ _ _ model = str(exceptions[0]) _
_ _
_ /usr/lib/python3.11/asyncio/base_events.py:1069 in create_connection _
_ _
_ 1066 _ _ _ _ # not using happy eyeballs _
_ 1067 _ _ _ _ for addrinfo in infos: _
_ 1068 _ _ _ _ _ try: _
_ _ 1069 _ _ _ _ _ _ sock = await self._connect_sock( _
_ 1070 _ _ _ _ _ _ _ exceptions, addrinfo, laddr_infos) _
_ 1071 _ _ _ _ _ _ break _
_ 1072 _ _ _ _ _ except OSError: _
_ _
_ /usr/lib/python3.11/asyncio/base_events.py:973 in _connect_sock _
_ _
_ 970 _ _ _ _ _ _ raise my_exceptions.pop() _
_ 971 _ _ _ _ _ else: _
_ 972 _ _ _ _ _ _ raise OSError(f"no matching local address with {family=} found") _
_ _ 973 _ _ _ await self.sock_connect(sock, address) _
_ 974 _ _ _ return sock _
_ 975 _ _ except OSError as exc: _
_ 976 _ _ _ my_exceptions.append(exc) _
_ _
_ /usr/lib/python3.11/asyncio/selector_events.py:634 in sock_connect _
_ _
_ 631 _ _ fut = self.create_future() _
_ 632 _ _ self._sock_connect(fut, sock, address) _
_ 633 _ _ try: _
_ _ 634 _ _ _ return await fut _
_ 635 _ _ finally: _
_ 636 _ _ _ # Needed to break cycles when an exception occurs. _
_ 637 _ _ _ fut = None _
_ _
_ /usr/lib/python3.11/asyncio/selector_events.py:642 in _sock_connect _
_ _
_ 639 _ def _sock_connect(self, fut, sock, address): _
_ 640 _ _ fd = sock.fileno() _
_ 641 _ _ try: _
_ _ 642 _ _ _ sock.connect(address) _
_ 643 _ _ except (BlockingIOError, InterruptedError): _
_ 644 _ _ _ # Issue #23618: When the C function connect() fails with EINTR, the _
_ 645 _ _ _ # connection runs in background. We have to wait until the socket _
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception: OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
________________________________ Traceback (most recent call last) _________________________________
_ /usr/local/bin/cve-bin-tool:8 in RuntimeError
on missing start_tls()
. _
ClientConnectorError: Cannot connect to host mirror.cveb.in:443 ssl:default [Network is unreachable]
``
Let me see if I understand this correctly. By "reran the command without using a proxy", do you mean that you used a network with a direct Internet connection, where no proxy needs to be used? Alternatively, do you mean that the network does have a proxy, but you deconfigured it before running cve-bin-tool? This latest error looks like a generic network problem; the proxy one usually manifests itself as a timeout.
More specifically, can you try with all but the EPSS and NVD data sources disabled, to limit the number of network connections it tries to establish, and check using e.g. netstat whether it successfully connects or at least attempts to?
Something like this:
$ cve-bin-tool -l debug -d CURL,GAD,OSV,REDHAT,RSD -u now <...rest of the command line...>
To illustrate, what I see is the following (with v3.3, the latest release, the network does have a proxy):
$ cve-bin-tool -l debug -d CURL,GAD,OSV,REDHAT,RSD -u now --report --detailed <...>
<...>
[15:04:30] DEBUG cve_bin_tool.CVEDB - Year 2022 has 24754 CVEs in dataset nvd_source.py:505
[15:04:34] DEBUG cve_bin_tool.CVEDB - Year 2023 has 28085 CVEs in dataset nvd_source.py:505
[15:04:36] DEBUG cve_bin_tool.CVEDB - Year 2024 has 12313 CVEs in dataset nvd_source.py:505
[15:04:37] DEBUG cve_bin_tool.CVEDB - Check database is using latest schema cvedb.py:323
DEBUG cve_bin_tool.CVEDB - Check database is using latest schema cvedb.py:323
DEBUG cve_bin_tool.CVEDB - Check database is using latest schema cvedb.py:323
DEBUG cve_bin_tool.CVEDB - Check database is using latest schema cvedb.py:323
DEBUG cve_bin_tool.CVEDB - Check database is using latest schema cvedb.py:323
<HERE IT STALLS FOR ABOUT 5 MINUTES, WITH NETSTAT INDICATING A DIRECT CONNECTION ATTEMPT, THEN FAILS WITH TimeoutError>
The netstat output looks like this during the stall:
$ netstat -aeptn |grep SYN
tcp 0 1 <my IP>:47948 18.66.233.97:443 SYN_SENT 1000 113280 14591/python
After adding trust_env=True at the appropriate places in epss_source.py (and with the proxy configured via the usual HTTPS_PROXY), the connection works fine and the DB update and subsequent scan succeed. Checking with netstat additionally confirms that it now does use the proxy.
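Before patching anything, it can help to confirm which proxy the environment actually selects for a given URL. The sketch below uses only the standard library and mirrors the conventional HTTPS_PROXY / NO_PROXY handling that, as far as I know, a session created with trust_env=True also relies on. The helper name `proxy_for` is hypothetical, and any host or proxy address passed to it here would be a placeholder.

```python
from urllib.parse import urlsplit
from urllib.request import getproxies, proxy_bypass


def proxy_for(url):
    """Return the proxy URL the environment selects for `url`, or None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Hosts matched by NO_PROXY / no_proxy are contacted directly.
    if host and proxy_bypass(host):
        return None
    # getproxies() maps a scheme ("http", "https") to its proxy URL.
    return getproxies().get(parts.scheme)
```

If this returns a proxy URL while netstat still shows a direct SYN_SENT connection like the one above, the HTTP client is most likely ignoring the environment.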
@alext-w In my scenario, I discovered that a proxy setup was necessary for internet connectivity. When attempting to use the cve-bin-tool with the following command:
cve-bin-tool --nvd json-mirror -l debug -d CURL,GAD,OSV,REDHAT,RSD -u now -i installed_packages.txt
I encountered a TimeoutError during the "Getting EPSS data..." phase. To resolve this issue, I modified the epss_source.py file within the cve-bin-tool codebase to make aiohttp.ClientSession honor the proxy environment variables.
The changes I made are as follows:
x: ~/cve-bin-tool>git diff
diff --git a/cve_bin_tool/data_sources/epss_source.py b/cve_bin_tool/data_sources/epss_source.py
index 455dacfa..88505994 100644
--- a/cve_bin_tool/data_sources/epss_source.py
+++ b/cve_bin_tool/data_sources/epss_source.py
@@ -69,7 +69,7 @@ class Epss_Source:
# Check if the file is older than 24 hours
if time_difference > timedelta(hours=24):
try:
- async with aiohttp.ClientSession(headers=HTTP_HEADERS) as session:
+ async with aiohttp.ClientSession(trust_env=True) as session:
async with session.get(self.DATA_SOURCE_LINK) as response:
response.raise_for_status()
self.LOGGER.info("Getting EPSS data...")
@@ -89,7 +89,7 @@ class Epss_Source:
else:
try:
- async with aiohttp.ClientSession(headers=HTTP_HEADERS) as session:
+ async with aiohttp.ClientSession(trust_env=True) as session:
async with session.get(self.DATA_SOURCE_LINK) as response:
response.raise_for_status()
By changing the aiohttp.ClientSession initialization to include trust_env=True, the session now respects the system's proxy environment variables.
This modification allowed the cve-bin-tool command to execute successfully without encountering the TimeoutError while fetching EPSS data.
Ok, thank you. Indeed then it looks like we have a common root cause here and that is the lack of proxy support for the EPSS source, which should be a two-line fix.
That is great news!
That said, I do want to make EPSS a source one can disable correctly, so I'll finish up #4125 too even though it sounds like it won't be necessary to fix this issue.
Thank you and yes, I think there's certainly value in being able to disable the EPSS source, so that other fix is much appreciated as well!