test: failing epss tests
The following tests are failing:
```
=========================== short test summary info ============================
FAILED test/test_cli.py::TestCLI::test_EPSS_probability - AssertionError: assert not 'haxx,curl,7.20.0,CVE-2011-2192,MEDIUM,4.3,REDHAT,2,AV:N/AC:M/Au:N/C:P/I:N/A:N,/tmp/tmpa3x5_cg9/curl-7.20.0-4.fc13.x86_64.rpm contains /usr/bin/curl,NewFound,,0.00228,0.61008'
FAILED test/test_cli.py::TestCLI::test_EPSS_percentile - AssertionError: assert not 'haxx,curl,7.20.0,CVE-2011-2192,MEDIUM,4.3,REDHAT,2,AV:N/AC:M/Au:N/C:P/I:N/A:N,/tmp/tmpa3x5_cg9/curl-7.20.0-4.fc13.x86_64.rpm contains /usr/bin/curl,NewFound,,0.00228,0.61008'
======== 2 failed, 22 passed, 5 skipped, 1 warning in 96.51s (0:01:36) =========
```
Not sure offhand if they're due to an epss data change or what. Pinging @Rexbeast2 and @anthonyharrison in case either of you knows of epss changes we should care about or wants to take a look at it.
I've added a temporary fix disabling the failing code (and only the failing code, not the whole test), but this needs a real fix later.
Hey @terriko, I have been looking into this issue and found some ambiguity that I would like to discuss.
The failing test_EPSS_probability fails because of this block of lines here:
```python
with caplog.at_level(logging.DEBUG):
    main(
        [
            "cve-bin-tool",
            "-x",
            "--epss-probability",
            "110",
            "-f",
            "csv",
            "-o",
            my_test_filename,
            str(Path(self.tempdir) / CURL_7_20_0_RPM),
        ]
    )
    # Verify that no CVEs are reported
    with open(my_test_filename_pathlib) as fd:
        assert not fd.read().split("\n")[1]
    caplog.clear()

if my_test_filename_pathlib.exists():
    my_test_filename_pathlib.unlink()
```
This runs cve-bin-tool with an EPSS probability of 110, then checks that the output file does not contain any CVE information by reading the second line of the file and asserting that it is empty.
In the cli.py file here:
```python
epss_probability = 0
if (
    args["epss_probability"]
    and float(args["epss_probability"]) >= 0
    and float(args["epss_probability"]) <= 100
):
    epss_probability = float(args["epss_probability"]) / 100
    LOGGER.debug(f"epss probability stored {epss_probability}")
elif args["epss_probability"]:
    LOGGER.debug(
        f'epss probability {args["epss_probability"]} is invalid so set it to 0'
    )
```
This block of code ensures that whenever epss_probability is not in the range (0, 100), its value falls back to 0. Since the test above passes 110 as the argument, it is outside the range, so its value is set to 0.
With epss_probability = 0, the tool finds every CVE with an EPSS probability of at least 0 (many of which exist in the test package) and writes them all to the output file.
So the expected behaviour should be that the output file contains many CVEs for the input 110 (since it effectively reports all CVEs it can find at a probability of 0). Why, then, are we checking that the output file is empty for the value 110?
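The fallback described above can be condensed into a standalone sketch (a simplified restatement of the cli.py logic, not the actual function):

```python
def effective_epss_probability(arg):
    """Simplified restatement of the cli.py fallback: out-of-range values become 0."""
    if arg and 0 <= float(arg) <= 100:
        # Valid percentage: convert to a 0-1 probability.
        return float(arg) / 100
    # Anything else (including 110) silently falls back to 0.
    return 0.0

print(effective_epss_probability("110"))  # 0.0 -> every CVE passes the filter
print(effective_epss_probability("60"))   # 0.6
```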
@terriko Just a reminder in case you missed my above post.
@ayushthe1 thanks for the analysis.
So, we could fake it by having 110 be an allowed value just for testing, or this test may require some re-thinking. Any ideas now that you've looked at it?
@terriko I think if we need to check the 110 value, then we should assert that the output file containing the CVEs has strings like "MEDIUM" or "HIGH", similar to what we do for the probability value of -12.
Changing the actual function to make 110 an allowed value just for testing purposes doesn't seem like a good idea to me, because we would have to convert 110 to a valid value in (0, 100) anyway.
Another limitation of test_EPSS_probability is that we can't assert on things based on the EPSS probability itself, as this value changes frequently. (Otherwise, if we knew that our test package contains x CVEs above an EPSS probability of 0.6, we could just assert that for the value 60 we should get x CVEs.)
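The assertion style suggested above could look roughly like this; the sample line mirrors the CSV shape shown in the failing-test message at the top of this issue, while the real test would read the file the scan produced:

```python
# Stand-in CSV output in the shape the failing test reported above.
sample_output = (
    "vendor,product,version,cve_number,severity\n"
    "haxx,curl,7.20.0,CVE-2011-2192,MEDIUM\n"
)

# For an out-of-range value like 110 (treated as 0), CVEs are expected,
# so at least one severity string should appear in the report.
assert any(sev in sample_output for sev in ("MEDIUM", "HIGH", "CRITICAL"))
print("severity string found")
```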
Happy to get any further suggestions or ideas from your side.
The check for 110 is to ensure that an out of range EPSS value is ignored. This should always be the case even if the EPSS values change.
Agree @anthonyharrison, but for the 110 value, which is outside the (0-100) range, the default epss_probability is set to 0. So when we test with 110 we will actually get all the CVEs present in the test package. So to test this, shouldn't we check whether the output file (which contains all the CVEs found in the test package with a probability above epss_probability) has strings like "MEDIUM" or "HIGH" to verify that the file isn't empty? Currently we expect the output file to be empty for the value 110.
I have just looked at the code in cli.py, and if the epss_probability or percentile is out of range, a default value of 0 is assumed (which is OK if no EPSS parameters are specified). I do wonder if this is the correct way to proceed, as we don't do this for other parameters, e.g. cvss, which is just checked for a positive value.
The default value is new here, incidentally, so if it needs to be changed to something different then it would be nice if we did it before the release. I don't have strong feelings about it, but I don't actually use EPSS data in my work right now so I'm probably not the right person to ask.
@anthonyharrison @terriko Could you suggest any ideas on how we should move forward on this issue? Happy to work on it.
@ayushthe1 I don't know the best way to handle this, because EPSS values change fairly regularly and we don't want the test to break.
- Probably the ideal solution would be to use mock to fake a specific value in the database and have the code return that one.
- A more lazy solution would be to choose an EPSS value where you expect some results returned and check for a number of results.
For option 2:
Try --epss-probability 5 or something similarly low until you get it returning a few results, then re-write this section:
```python
# Verify that no CVEs are reported
with open(my_test_filename_pathlib) as fd:
    assert not fd.read().split("\n")[1]
```
So instead of looking for no results, it looks for whatever results you found when you ran it manually. Don't be too specific because the epss data changes, but something like "I expect at least 1 result" will likely not need changing regularly.
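Concretely, the rewritten check might look like this (using a stand-in file here, where the real test would reuse `my_test_filename_pathlib`):

```python
import tempfile
from pathlib import Path

# Stand-in output file shaped like the real CSV report.
report = Path(tempfile.mkdtemp()) / "output.csv"
report.write_text(
    "vendor,product,version,cve_number,severity\n"
    "haxx,curl,7.20.0,CVE-2011-2192,MEDIUM\n"
)

# Instead of asserting the report is empty, require at least one data row;
# the exact count drifts as EPSS data changes, so keep the bound loose.
rows = [line for line in report.read_text().split("\n")[1:] if line.strip()]
assert len(rows) >= 1
print(len(rows))  # 1 with this stand-in file
```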