fix: cvedb metric refactoring
This PR addresses the issue of handling unknown values in metrics by ensuring the UNKNOWN metric is properly initialized and validated in the CVE database. Key changes include:
✔️ Added UNKNOWN_METRIC_ID to populate_metrics to ensure the "UNKNOWN" metric is inserted into the metrics table. ✔️ Enhanced schema validation in latest_schema to check for the existence of the UNKNOWN metric, triggering a refresh if missing. ✔️ Updated metrics during refresh to ensure metrics are re-populated if the schema is outdated.
These changes ensure that the database consistently handles unknown values and maintains up-to-date metrics, improving reliability and correctness.
Fixes #4812
@22f1001635 What do you think about creating a constant for the metrics data, used in populate_metrics:
https://github.com/intel/cve-bin-tool/blob/8f9043ae91bcf7ae7a55e6e881fe470826cc12a2/cve_bin_tool/cvedb.py#L645-L650
like:
METRICS = [
(UNKNOWN_METRIC_ID, "UNKNOWN"),
(EPSS_METRIC_ID, "EPSS"),
(CVSS_2_METRIC_ID, "CVSS-2"),
(CVSS_3_METRIC_ID, "CVSS-3"),
]
and use the new constant in populate_metrics and in latest_schema for iteration like that:
# Check for metrics table data
if table_name == "metrics":
for metrics_id, metrics_name in METRICS:
result = cursor.execute(
"SELECT * FROM metrics WHERE metrics_id=? AND metrics_name", (metrics_id, metrics_name)
)
if not result.fetchone():
schema_latest = False
This has the benefit that it's simpler to add additional METRICS in the future like SSVC in example.
Hi @jloehel , I have made the required changes. Can you take a look and let me know
Hi @jloehel , I have made the required changes. Can you take a look and let me know
Hi :-) Thanks. I have read the code again and I am not sure if the condition is at the right place because self.refresh_cache_and_update_db() gets still only executed if the db does not exist, the db is older than 24 days and the latest_schema is not matching. I think the condition needs to go here:
https://github.com/intel/cve-bin-tool/blob/04c47b803dff6a08913f4863ebbdb706b1bb6dbb/cve_bin_tool/cvedb.py#L328-L336
... and you don't need to call populate_metrics again. It gets called in populate_db already. Only the condition when the database gets updated needs to get modified. Sorry, I should have checked this earlier.
@jloehel made changes; take a look and let me know if anything else is needed
@22f1001635 What do you think about a test case for this scenario?
- Database exists, is not older than 24 hours and the schemas are all correct, but the UNKNOWN_METRIC is missing.
- Update the database
- Check if the UNKNOWN_METRIC exists after the update
hi @jloehel I have made a test case covering this do take a look and let me know your thoughts
Hi @jloehel @terriko, please take a look at the changes and let me know if anything else is needed