
CISA Data Importer via GitHub Repo- Added

Open Rishi-source opened this issue 1 year ago • 3 comments

Add CISA GOV Vulnrichment Importer

This pull request adds a new importer for the CISA GOV Vulnrichment dataset. The importer fetches vulnerability data from the CISAGOV/vulnrichment GitHub repository and imports it into our database.

Related Issue

Closes #1475

Changes

  • Added a new VulnrichImporter class in vulnerabilities/importers/cisagov.py
  • Implemented methods to fetch and parse advisory data from the CISAGOV GitHub repository
  • Added deduplication logic to prevent unnecessary updates to existing records
  • Integrated the new importer with the existing import system

How to Use

To use the new importer, run the following management command:

python manage.py import vulnerabilities.importers.vulnrichment.VulnrichImporter

This command will fetch the latest data from the CISAGOV/vulnrichment repository and import it into the database.

Features

  • Fetches vulnerability data from the CISAGOV/vulnrichment GitHub repository
  • Parses CVE data, including CVE ID, summary, references, and weaknesses
  • Extracts severity scores
  • Implements content-based deduplication to avoid unnecessary updates
  • Logs skipped advisories for transparency
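As a rough illustration of the parsing step listed above, here is a minimal sketch that pulls the CVE ID, summary, references, weaknesses, and CVSS scores out of one record. It assumes the files follow the CVE JSON 5.x record layout (`cveMetadata`, `containers.cna`, `containers.adp`). The function name `parse_cve_record`, the returned dictionary shape, and the sample record are hypothetical and are not taken from the importer in this PR.

```python
def parse_cve_record(record: dict) -> dict:
    """Extract the fields listed above from one CVE JSON 5.x style record.

    Assumption: the record uses the `cveMetadata` / `containers.cna` /
    `containers.adp` layout. The output shape is illustrative only.
    """
    containers = record.get("containers", {})
    cna = containers.get("cna", {})
    adp_list = containers.get("adp", [])

    # Use the first English description from the CNA container as the summary.
    summary = next(
        (
            d.get("value", "")
            for d in cna.get("descriptions", [])
            if d.get("lang", "").startswith("en")
        ),
        "",
    )

    references, weaknesses, severities = [], [], []
    for container in [cna, *adp_list]:
        for ref in container.get("references", []):
            url = ref.get("url")
            if url and url not in references:
                references.append(url)
        for problem in container.get("problemTypes", []):
            for desc in problem.get("descriptions", []):
                cwe = desc.get("cweId")
                if cwe and cwe not in weaknesses:
                    weaknesses.append(cwe)
        for metric in container.get("metrics", []):
            for key, value in metric.items():
                # Keep only CVSS entries (cvssV2_0, cvssV3_1, ...); skip others.
                if key.startswith("cvssV") and isinstance(value, dict):
                    severities.append(
                        {
                            "system": key,
                            "score": value.get("baseScore"),
                            "vector": value.get("vectorString"),
                        }
                    )

    return {
        "cve_id": record.get("cveMetadata", {}).get("cveId"),
        "summary": summary,
        "references": references,
        "weaknesses": weaknesses,
        "severities": severities,
    }


# Made-up sample record for illustration; not real vulnerability data.
SAMPLE = {
    "cveMetadata": {"cveId": "CVE-0000-0001"},
    "containers": {
        "cna": {
            "descriptions": [{"lang": "en", "value": "Example overflow."}],
            "references": [{"url": "https://example.com/advisory"}],
            "problemTypes": [{"descriptions": [{"cweId": "CWE-787", "lang": "en"}]}],
            "metrics": [
                {
                    "cvssV3_1": {
                        "baseScore": 9.8,
                        "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
                    }
                }
            ],
        },
        "adp": [
            {
                "references": [{"url": "https://example.com/advisory"}],
                "metrics": [{"other": {"type": "ssvc", "content": {}}}],
            }
        ],
    },
}

parsed = parse_cve_record(SAMPLE)
```

References and weaknesses are collected from both the CNA and ADP containers and de-duplicated in order of appearance, so a URL repeated by the ADP container is kept only once.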

Testing

  • Tested the importer with a sample dataset from the CISAGOV repository
  • Verified that duplicate entries are not created when running the importer multiple times
  • Checked that updates are only applied when the advisory content has changed
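The description does not show how the content-based deduplication is implemented. One possible approach, sketched here under that caveat, is to fingerprint each advisory with SHA-256 over a canonical JSON serialization and compare it with the stored fingerprint. The names `content_fingerprint` and `upsert` are hypothetical, and a plain dictionary stands in for the database.

```python
import hashlib
import json


def content_fingerprint(advisory: dict) -> str:
    """Return a stable SHA-256 hex digest of the advisory content.

    Sorting keys makes the digest independent of dictionary key order.
    """
    canonical = json.dumps(advisory, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def upsert(store: dict, advisory_id: str, advisory: dict) -> str:
    """Create, skip, or update an advisory based on its fingerprint.

    `store` maps advisory IDs to fingerprints; it is a stand-in for the
    real database (an assumption made for this sketch).
    """
    fingerprint = content_fingerprint(advisory)
    existing = store.get(advisory_id)
    if existing is None:
        store[advisory_id] = fingerprint
        return "created"
    if existing == fingerprint:
        return "skipped"  # unchanged content: no write needed
    store[advisory_id] = fingerprint
    return "updated"


store = {}
first = upsert(store, "CVE-0000-0001", {"summary": "a", "references": ["x"]})
# Same content with a different key order should be treated as unchanged.
second = upsert(store, "CVE-0000-0001", {"references": ["x"], "summary": "a"})
third = upsert(store, "CVE-0000-0001", {"summary": "b", "references": ["x"]})
```

Running the same import twice then hits the "skipped" branch, which matches the behavior described in the testing notes: no duplicate record, and a write only when the content has changed.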

Additional Notes

  • The importer uses the GitHub API to fetch repository content. For production use, account for GitHub API rate limits; unauthenticated requests are limited far more tightly than authenticated ones.
  • The GITHUB_API_BASE, REPO_OWNER, REPO_NAME, and BRANCH constants in the importer can be adjusted if the source repository changes in the future.
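To make the two notes above concrete, here is a minimal sketch of how the four constants could combine into a GitHub "repository contents" API URL, and how the rate-limit response headers could be used to decide how long to wait. The constant values are assumptions (in particular the branch name), and the helpers `contents_url` and `seconds_to_wait` are hypothetical, not the importer's actual code.

```python
from urllib.parse import quote, urlencode

# Assumed values; adjust them if the source repository changes.
GITHUB_API_BASE = "https://api.github.com"
REPO_OWNER = "cisagov"
REPO_NAME = "vulnrichment"
BRANCH = "develop"  # assumption: replace with the branch actually tracked


def contents_url(path: str = "") -> str:
    """Build the contents-API URL for a path inside the repository."""
    base = f"{GITHUB_API_BASE}/repos/{REPO_OWNER}/{REPO_NAME}/contents"
    if path:
        # quote() keeps "/" by default, so nested paths stay intact.
        base = f"{base}/{quote(path.strip('/'))}"
    return f"{base}?{urlencode({'ref': BRANCH})}"


def seconds_to_wait(headers: dict, now: float) -> float:
    """Return how long to pause before the next request.

    Reads GitHub's X-RateLimit-Remaining and X-RateLimit-Reset headers.
    A plain dict lookup is case-sensitive, so pass the headers with this
    exact casing or use a case-insensitive mapping.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


root_url = contents_url()
nested_url = contents_url("2024/1xxx")
wait = seconds_to_wait(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1060"}, now=1000.0
)
```

Keeping the URL construction in one helper means a repository or branch change only touches the constants, as the note above intends.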

Please review and let me know if any changes or additional information is needed.

Rishi-source, Oct 15 '24 07:10

@keshav-space should we now instead use the new importer pipeline approach?

pombredanne, Oct 15 '24 08:10

> @keshav-space should we now instead use the new importer pipeline approach?

@pombredanne yes, we have a decent number of pipelines here https://github.com/aboutcode-org/vulnerablecode/tree/main/vulnerabilities/pipelines, and there are brief instructions on how to write a pipeline here https://github.com/aboutcode-org/vulnerablecode/pull/1589#discussion_r1756470589. I still need to add this to our tutorials on Read the Docs.

keshav-space, Oct 15 '24 08:10

@keshav-space Could you please explain the difference between the importer pipeline approach and the regular importer approach?

Rishi-source, Oct 15 '24 10:10