malicious-software-packages-dataset

[SECRES-3945] Separate manifest file and samples sync workflows

Open ikretz opened this issue 2 months ago • 0 comments

This PR splits sample syncing and manifest file syncing into two separate workflows.

A new script, scripts/sync-manifest/, has been added, along with a corresponding new workflow, .github/workflows/sync-manifest.yaml. For an initial trial period, this workflow is set to run every 2 hours on weekdays between 09:00 and 17:00 UTC, with manual PR approval required. The workflow runs the new sync_manifest script to sync the manifest files directly with the backend.
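
For reviewers who want to sanity-check the trial schedule, here is a minimal sketch of what the trigger in .github/workflows/sync-manifest.yaml could look like. The cron expression is an assumption derived from the description above (runs at 09:00, 11:00, 13:00, 15:00 and 17:00 UTC, Monday through Friday) and is not copied from the actual workflow file; the real file may start at a different hour or exclude the 17:00 run.

```yaml
# Hypothetical trigger block, not taken from the actual workflow file.
on:
  schedule:
    # GitHub Actions evaluates cron schedules in UTC.
    # Fields: minute, hour, day of month, month, day of week.
    # Minute 0 of hours 9, 11, 13, 15 and 17, Monday (1) through Friday (5).
    - cron: "0 9,11,13,15,17 * * 1-5"
```

Listing the hours explicitly keeps the schedule easy to adjust once the trial period ends. Note that scheduled runs on GitHub Actions can be delayed during periods of high load, so actual start times may drift from the top of the hour.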

Other changes include:

  • Remove the scripts/generate_manifest/ script that was previously responsible for generating the manifest files from the dataset contents
  • Update the sync-malicious-packages workflow to no longer perform any manifest-related operations
  • Perform a one-off, all-time sync of the manifests with the backend (picked up several hundred new items)
  • Remove the sample npm/compromised-libs/xrlp/4.3.0, which should not have been included in the dataset

ikretz, Oct 28 '25 15:10