cve-bin-tool icon indicating copy to clipboard operation
cve-bin-tool copied to clipboard

fix: cvedb metric refactoring

Open Saksham-Sirohi opened this issue 9 months ago • 7 comments

This PR addresses the issue of handling unknown values in metrics by ensuring the UNKNOWN metric is properly initialized and validated in the CVE database. Key changes include:

✔️ Added UNKNOWN_METRIC_ID to populate_metrics to ensure the "UNKNOWN" metric is inserted into the metrics table. ✔️ Enhanced schema validation in latest_schema to check for the existence of the UNKNOWN metric, triggering a refresh if missing. ✔️ Updated metrics during refresh to ensure metrics are re-populated if the schema is outdated.

These changes ensure that the database consistently handles unknown values and maintains up-to-date metrics, improving reliability and correctness.

Fixes #4812

Saksham-Sirohi avatar Mar 19 '25 20:03 Saksham-Sirohi

@22f1001635 What do you think about creating a constant for the metrics data, used in populate_metrics:

https://github.com/intel/cve-bin-tool/blob/8f9043ae91bcf7ae7a55e6e881fe470826cc12a2/cve_bin_tool/cvedb.py#L645-L650

like:

METRICS = [
        (UNKNOWN_METRIC_ID, "UNKNOWN"),
        (EPSS_METRIC_ID, "EPSS"),
        (CVSS_2_METRIC_ID, "CVSS-2"),
        (CVSS_3_METRIC_ID, "CVSS-3"),
]

and use the new constant in populate_metrics and in latest_schema for iteration like that:

        # Check for metrics table data
        if table_name == "metrics":
			for metrics_id, metrics_name in METRICS:
                result = cursor.execute(
                    "SELECT * FROM metrics WHERE metrics_id=? AND metrics_name", (metrics_id, metrics_name)
               )
               if not result.fetchone():
                   schema_latest = False

This has the benefit that it's simpler to add additional METRICS in the future like SSVC in example.

jloehel avatar Mar 19 '25 21:03 jloehel

Hi @jloehel , I have made the required changes. Can you take a look and let me know

Saksham-Sirohi avatar Mar 20 '25 07:03 Saksham-Sirohi

Hi @jloehel , I have made the required changes. Can you take a look and let me know

Hi :-) Thanks. I have read the code again and I am not sure if the condition is at the right place because self.refresh_cache_and_update_db() gets still only executed if the db does not exist, the db is older than 24 days and the latest_schema is not matching. I think the condition needs to go here:

https://github.com/intel/cve-bin-tool/blob/04c47b803dff6a08913f4863ebbdb706b1bb6dbb/cve_bin_tool/cvedb.py#L328-L336

... and you don't need to call populate_metrics again. It gets called in populate_db already. Only the condition when the database gets updated needs to get modified. Sorry, I should have checked this earlier.

jloehel avatar Mar 20 '25 18:03 jloehel

@jloehel made changes; take a look and let me know if anything else is needed

Saksham-Sirohi avatar Mar 20 '25 19:03 Saksham-Sirohi

@22f1001635 What do you think about a test case for this scenario?

  • Database exists, is not older than 24 hours and the schemas are all correct, but the UNKNOWN_METRIC is missing.
  • Update the database
  • Check if the UNKNOWN_METRIC exists after the update

jloehel avatar Mar 25 '25 16:03 jloehel

hi @jloehel I have made a test case covering this do take a look and let me know your thoughts

Saksham-Sirohi avatar Apr 20 '25 17:04 Saksham-Sirohi

Hi @jloehel @terriko, please take a look at the changes and let me know if anything else is needed

Saksham-Sirohi avatar Aug 01 '25 17:08 Saksham-Sirohi