CISA Data Importer via GitHub Repo- Added
Add CISA GOV Vulnrichment Importer
This pull request adds a new importer for the CISA GOV Vulnrichment dataset. The importer fetches vulnerability data from the CISAGOV/vulnrichment GitHub repository and imports it into our database.
Related Issue
Closes #1475
Changes
- Added a new VulnrichImporter class in vulnerabilities/importers/cisagov.py
- Implemented methods to fetch and parse advisory data from the CISAGOV GitHub repository
- Added deduplication logic to prevent unnecessary updates to existing records
- Integrated the new importer with the existing import system
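For reviewers, here is a minimal sketch of what content-based deduplication can look like. It is not the code in this PR: the function names `content_fingerprint` and `needs_update` are hypothetical, and it assumes an advisory can be represented as a JSON-serializable dict.

```python
import hashlib
import json


def content_fingerprint(advisory: dict) -> str:
    """Return a stable SHA-256 hex digest of an advisory's content.

    Hypothetical helper for illustration; not part of the importer's API.
    """
    # sort_keys makes the serialization independent of dict key order,
    # so two dicts with the same content always hash to the same value.
    canonical = json.dumps(advisory, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_update(stored_fingerprint, advisory: dict) -> bool:
    """Return True when no fingerprint is stored or the content has changed."""
    return stored_fingerprint != content_fingerprint(advisory)
```

With a fingerprint stored next to each record, a second import run over unchanged data would skip every advisory, and only advisories whose content differs would be written again.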
How to Use
To use the new importer, run the following management command:
python manage.py import vulnerabilities.importers.vulnrichment.VulnrichImporter
This command will fetch the latest data from the CISAGOV/vulnrichment repository and import it into the database.
Features
- Fetches vulnerability data from the CISAGOV/vulnrichment GitHub repository
- Parses CVE data, including CVE ID, summary, references, and weaknesses
- Extracts severity scores
- Implements content-based deduplication to avoid unnecessary updates
- Logs skipped advisories for transparency
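The parsing features listed above can be sketched roughly as follows. This assumes the files follow the CVE JSON 5.x record layout (`cveMetadata`, `containers.cna`, `containers.adp`); the function name `parse_cve_record` and the shape of the returned dict are hypothetical and may differ from what the importer actually does.

```python
def parse_cve_record(record: dict) -> dict:
    """Extract ID, summary, references, weaknesses, and severities.

    Illustrative only: assumes the CVE JSON 5.x record layout.
    """
    cve_id = record.get("cveMetadata", {}).get("cveId")
    containers = record.get("containers", {})
    cna = containers.get("cna", {})
    adp_list = containers.get("adp", [])

    # Use the first English description as the summary, if there is one.
    summary = ""
    for description in cna.get("descriptions", []):
        if description.get("lang", "").startswith("en"):
            summary = description.get("value", "")
            break

    references = [ref["url"] for ref in cna.get("references", []) if ref.get("url")]

    weaknesses = []
    severities = []
    # Weaknesses and metrics may appear in the CNA container or in any ADP container.
    for container in [cna, *adp_list]:
        for problem_type in container.get("problemTypes", []):
            for desc in problem_type.get("descriptions", []):
                cwe_id = desc.get("cweId")
                if cwe_id and cwe_id not in weaknesses:
                    weaknesses.append(cwe_id)
        for metric in container.get("metrics", []):
            for key, value in metric.items():
                # Keep only CVSS entries such as "cvssV3_1"; skip other metric types.
                if key.startswith("cvss") and isinstance(value, dict):
                    severities.append(
                        {
                            "system": key,
                            "score": value.get("baseScore"),
                            "vector": value.get("vectorString"),
                        }
                    )

    return {
        "cve_id": cve_id,
        "summary": summary,
        "references": references,
        "weaknesses": weaknesses,
        "severities": severities,
    }
```

Every lookup goes through `.get` with a default, so a record that is missing a section yields empty fields instead of raising.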
Testing
- Tested the importer with a sample dataset from the CISAGOV repository
- Verified that duplicate entries are not created when running the importer multiple times
- Checked that updates are only applied when the advisory content has changed
Additional Notes
- The importer uses the GitHub API to fetch repository content, so GitHub API rate limits need to be taken into account for production use.
- The GITHUB_API_BASE, REPO_OWNER, REPO_NAME, and BRANCH constants in the importer can be adjusted if the source repository changes in the future.
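To illustrate both notes, here is a rough sketch of how the four constants might combine into a GitHub contents API URL and how a rate-limit response could be handled. The constant values shown are assumptions for illustration (the branch name in particular should be checked against the repository), and `contents_url` and `seconds_to_wait` are hypothetical helpers, not functions from this PR.

```python
from urllib.parse import quote

# Assumed values for illustration; the importer's actual values may differ.
GITHUB_API_BASE = "https://api.github.com"
REPO_OWNER = "cisagov"
REPO_NAME = "vulnrichment"
BRANCH = "develop"  # assumption: verify against the repository's default branch


def contents_url(path: str = "") -> str:
    """Build a GitHub REST 'repository contents' URL for a path on BRANCH."""
    encoded_path = quote(path.strip("/"))  # keeps "/" between path segments
    return (
        f"{GITHUB_API_BASE}/repos/{REPO_OWNER}/{REPO_NAME}"
        f"/contents/{encoded_path}?ref={quote(BRANCH, safe='')}"
    )


def seconds_to_wait(headers: dict, now: float) -> float:
    """Return how long to pause before the next request.

    Reads GitHub's X-RateLimit-Remaining and X-RateLimit-Reset headers.
    A plain dict is case-sensitive, so keys must match exactly here.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    if remaining > 0:
        return 0.0
    # Quota exhausted: wait until the reset time (epoch seconds).
    return max(0.0, reset_at - now)
```

Keeping the URL construction in one place means a change of repository or branch only touches the constants. In production, authenticated requests would also raise the available quota.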
Please review and let me know if any changes or additional information are needed.
@keshav-space should we now instead use the new importer pipeline approach?
@pombredanne yes, we have a decent number of pipelines here https://github.com/aboutcode-org/vulnerablecode/tree/main/vulnerabilities/pipelines, and there is a brief instruction on how to write a pipeline here https://github.com/aboutcode-org/vulnerablecode/pull/1589#discussion_r1756470589. I still need to add this to our tutorials in Read the Docs.
@keshav-space Can you please tell me what the difference is between the importer pipeline approach and the normal importer approach?