test: improve performance on our slowest tests
In #4319 I'm switching pytest to print our longest-running tests so we can work on improving the performance of our test suite. On a random local run, here's what I saw:
======================================== slowest 50 durations ========================================
291.38s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Cargo.lock-products1]
203.69s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Gemfile.lock-products2]
119.99s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/requirements.txt-products3]
99.68s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/renv.lock-products0]
49.71s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/go.mod-products6]
38.39s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/package-lock.json-products4]
23.17s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/cpanfile-products9]
11.63s call test/test_requirements.py::test_requirements
11.18s call test/test_cli.py::TestCLI::test_EPSS_percentile
11.14s call test/test_cli.py::TestCLI::test_EPSS_probability
10.74s call test/test_language_scanner.py::TestLanguageScanner::test_java_package[/home/terri/Code/cve-bin-tool/test/language_data/pom.xml-product_list0]
9.26s setup test/test_available_fix.py::TestAvailableFixReport::test_debian_backport_fix_output
7.81s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/Package.resolved-products7]
7.31s call test/test_language_scanner.py::TestLanguageScanner::test_language_package_none_found[/home/terri/Code/cve-bin-tool/test/language_data/fail_pom.xml]
5.22s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/.package-lock.json-products5]
4.83s call test/test_cli.py::TestCLI::test_sbom_detection
4.83s call test/test_cli.py::TestCLI::test_CVSS_score
4.82s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/composer.lock-products8]
4.53s call test/test_cli.py::TestCLI::test_SBOM
3.82s call test/test_source_purl2cpe.py::TestSourceOSV::test_db_contents[1-False]
2.20s call test/test_language_scanner.py::TestLanguageScanner::test_language_package[/home/terri/Code/cve-bin-tool/test/language_data/pubspec.lock-products10]
2.18s call test/test_cli.py::TestCLI::test_disabled_sources
2.12s setup test/test_cli.py::TestCLI::test_extract_bad_zip_messages
2.01s setup test/test_cli.py::TestCLI::test_sbom_detection
1.65s setup test/test_cli.py::TestCLI::test_EPSS_probability
1.59s setup test/test_cli.py::TestCLI::test_SBOM
1.49s call test/test_output_engine.py::TestOutputEngine::test_output_file_wrapper
1.44s call test/test_cli.py::TestCLI::test_severity
1.39s setup test/test_cli.py::TestCLI::test_EPSS_percentile
1.31s call test/test_cli.py::TestCLI::test_quiet_mode
1.13s call test/test_merge.py::TestMergeReports::test_valid_merge[filepaths0-merged_data0]
1.00s setup test/test_cli.py::TestCLI::test_extract_encrypted_zip_messages
0.99s setup test/test_html.py::TestOutputHTML::test_interactive_mode_print_mode_switching[chromium]
0.95s call test/test_language_scanner.py::TestLanguageScanner::test_javascript_package_none_found[/home/terri/Code/cve-bin-tool/test/language_data/fail-package-lock.json]
0.89s setup test/test_cli.py::TestCLI::test_disabled_sources
0.83s setup test/test_cli.py::TestCLI::test_config_generator[args0-config.yaml-expected_contents0]
0.80s call test/test_sbom.py::TestSBOM::test_invalid_xml[/home/terri/Code/cve-bin-tool/test/sbom/spdx_test.spdx.xml-cyclonedx-True]
0.73s setup test/test_cli.py::TestCLI::test_config_generator[args1-config.toml-expected_contents1]
0.70s call test/test_cli.py::TestCLI::test_extract_bad_zip_messages
0.68s call test/test_cli.py::TestCLI::test_runs
0.66s call test/test_sbom.py::TestSBOM::test_invalid_xml[/home/terri/Code/cve-bin-tool/test/sbom/swid_test.xml-cyclondedx-True]
0.65s call test/test_cli.py::TestCLI::test_skips
0.64s call test/test_merge.py::TestMergeReports::test_valid_cve_scanner_instance[filepaths0]
0.60s call test/test_sbom.py::TestSBOM::test_sbom_detection[/home/terri/Code/cve-bin-tool/test/sbom/cyclonedx_test.xml-cyclonedx]
0.55s setup test/test_cli.py::TestCLI::test_invalid_parameter
0.55s setup test/test_cli.py::TestCLI::test_version
0.55s setup test/test_cli.py::TestCLI::test_severity
0.55s setup test/test_cli.py::TestCLI::test_quiet_mode
0.54s setup test/test_cli.py::TestCLI::test_skips
0.54s setup test/test_cli.py::TestCLI::test_usage
====================================== short test summary info =======================================
It looks like our language scanner tests are by far the slowest on my machine: the top seven entries are all test_language_package cases, ranging from about 23 s to 291 s. If I had to guess, the primary problem is the sheer number of products and vulnerabilities those tests look up, so I would start by reducing the test files to look up a minimal number of products, and by making sure that the products they look up have a minimal number of vulnerabilities. Exactly how many products you should keep will depend on what's needed to test different parsing paths and to match how a full lock file with dependencies should look for the language, but if you can get enough test coverage with 1 product that has 1 vulnerability, go for it!
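As a rough illustration of trimming a fixture, here is a minimal sketch that keeps only chosen packages from a requirements-style file. The helper name `keep_packages` and the sample file contents are hypothetical (they are not the real test/language_data/requirements.txt), and which product to keep would still need to be picked by hand based on how many vulnerabilities it has.

```python
import re


def keep_packages(requirements_text: str, wanted: set[str]) -> str:
    """Return only the requirement lines whose package name is in `wanted`.

    `wanted` is expected to contain lowercase package names.
    """
    kept = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        # The package name is everything before the first version
        # operator, extras bracket, environment marker, or space.
        name = re.split(r"[=<>!~\[; ]", stripped, maxsplit=1)[0].lower()
        if name in wanted:
            kept.append(stripped)
    return "\n".join(kept) + "\n"


# Illustrative contents only; names and versions are placeholders.
full = """\
# sample fixture
requests==2.11.1
httplib2==0.18.1
html5lib==0.99
"""

trimmed = keep_packages(full, {"requests"})
print(trimmed)
```

Lock files for other languages (Cargo.lock, Gemfile.lock, and so on) have their own structure, so a line filter like this would not carry over directly; those fixtures probably need to be regenerated or edited by hand.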
It's entirely possible that there are also performance gains to be had in the language scanner code if you want to do a deeper dive there too!
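For anyone doing that deeper dive, one possible starting point is the standard library's cProfile. The sketch below profiles a toy function; `parse_line` and `scan_lines` are hypothetical stand-ins, not functions from the language scanner, and would be swapped for a call into the real code.

```python
import cProfile
import io
import pstats


def parse_line(line):
    # Hypothetical stand-in for per-entry parsing work.
    name, _, version = line.partition("==")
    return name.strip(), version.strip()


def scan_lines(lines):
    # Hypothetical stand-in for scanning one lock file.
    return [parse_line(line) for line in lines]


lines = [f"pkg{i}==1.{i}" for i in range(1000)]

profiler = cProfile.Profile()
profiler.enable()
result = scan_lines(lines)
profiler.disable()

# Collect the five entries with the largest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the outer call that dominates a run, which should help show whether the time goes to parsing or to the lookups.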
Hi @terriko, this looks like an interesting issue! I’d love to help speed up the tests. I’ll start by checking the longest-running ones and see if we can reduce the number of product lookups while keeping the coverage solid. I’ll also take a look at the language scanner to see if there are any performance tweaks we can make. Let me know if there is anything specific I should keep in mind. Excited to contribute!
@Gyan-max thanks! I think reducing the product lookups is going to make the biggest difference even if we make other performance tweaks, so probably start there.
@terriko To improve the performance of our test suite, I propose the following solutions:
- Reduce Test File Sizes
  - Limit the number of products in each test file to the minimum required for effective parsing validation.
  - Ensure each product has only one vulnerability where possible.
- Optimize Vulnerability Lookups
  - Investigate whether unnecessary lookups are being performed.
  - Consider caching or mocking responses to reduce execution time.
- Enable Parallel Execution
  - Use pytest-xdist (pytest -n auto) to run tests in parallel.
- Profile and Optimize Code
  - Use pytest --durations=10 --profile to identify performance bottlenecks (the --profile option comes from the pytest-profiling plugin, not pytest itself).
  - Optimize the parsing logic and data structures in language_scanner.py for efficiency.
These steps should help reduce execution time while maintaining test coverage.
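To make the caching idea above concrete, here is a minimal sketch using functools.lru_cache. `slow_lookup` is a hypothetical stand-in for whatever performs the real vulnerability lookup, and the returned identifier is a placeholder, so this shows only the pattern and is not the project's API.

```python
from functools import lru_cache

# Counts how often the expensive path actually runs.
CALLS = {"count": 0}


def slow_lookup(vendor, product, version):
    # Hypothetical stand-in for an expensive database query.
    CALLS["count"] += 1
    return [f"EXAMPLE-ID for {vendor}/{product} {version}"]


@lru_cache(maxsize=None)
def cached_lookup(vendor, product, version):
    # Return a tuple so the cached value cannot be mutated by callers.
    return tuple(slow_lookup(vendor, product, version))


# Three identical queries plus one different query.
queries = [("vendor_a", "product_a", "1.0")] * 3 + [("vendor_a", "product_b", "2.0")]
results = [cached_lookup(*query) for query in queries]

print(CALLS["count"])
```

Repeated queries for the same (vendor, product, version) are served from the cache, so only distinct queries reach the slow path. Whether this helps the tests depends on how often the same product is looked up more than once, which would need measuring first.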
Hi @terriko, I'd like to work on this issue as part of my GSoC preparation.
Could you assign it to me? I will analyze the test performance and suggest improvements.
Thanks!
Hi, I’ve opened a PR that addresses this by adding lazy CVE DB initialization and language short-circuiting in scan_file, along with minimizing the fail_pom.xml fixture. The language POM test went from 0.18 s → 0.13 s on my machine.
PR: #5390
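For readers unfamiliar with the two techniques mentioned, here is a minimal sketch of lazy initialization combined with an early return. It is not the code from the PR; `FakeDB`, `Scanner`, and `LANGUAGE_FILES` are hypothetical names used only to show the pattern.

```python
from functools import cached_property
from pathlib import Path


class FakeDB:
    """Hypothetical stand-in for an expensive-to-open database."""

    instances = 0

    def __init__(self):
        FakeDB.instances += 1


class Scanner:
    # Hypothetical set of file names the scanner knows how to parse.
    LANGUAGE_FILES = {"requirements.txt", "Cargo.lock"}

    @cached_property
    def db(self):
        # Created on first access only, then reused.
        return FakeDB()

    def scan_file(self, filename):
        name = Path(filename).name
        # Short-circuit: unknown files return before the database is touched.
        if name not in self.LANGUAGE_FILES:
            return []
        _ = self.db  # first access triggers initialization
        return [name]


scanner = Scanner()
scanner.scan_file("README.md")
created_after_skip = FakeDB.instances

scanner.scan_file("test/requirements.txt")
scanner.scan_file("Cargo.lock")
created_after_scans = FakeDB.instances

print(created_after_skip, created_after_scans)
```

With this shape, files that are skipped never pay the database setup cost, and files that are scanned share a single instance.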