malicious-software-packages-dataset
[SECRES-3945] Separate manifest file and samples sync workflows
This PR splits samples syncing and manifest file syncing into two separate workflows.
A new script, scripts/sync-manifest/, has been added, along with a corresponding new workflow, .github/workflows/sync-manifest.yaml. For an initial trial period, this workflow is set to run every 2 hours on weekdays between 09:00 and 17:00 UTC, with manual PR approval required. The workflow runs the new sync_manifest script to sync the manifest files directly with the backend.
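For reference, a schedule like the one described could be written as the following trigger. This is a minimal sketch, not a copy of the actual workflow file: it assumes runs start on the hour at 09:00 and that 17:00 is included, neither of which is stated above. The manual approval step is not shown.

```yaml
# Hypothetical excerpt; the real .github/workflows/sync-manifest.yaml may differ.
on:
  schedule:
    # GitHub Actions evaluates cron schedules in UTC.
    # Fires at minute 0 of hours 9, 11, 13, 15 and 17, Monday through Friday.
    - cron: "0 9-17/2 * * 1-5"
```

Scheduled runs on GitHub Actions can start later than the nominal time when the service is under load, so the two-hour spacing is approximate.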
Other changes include:
- Remove the scripts/generate_manifest/ script that was previously responsible for generating the manifest files from the dataset contents
- Update the sync-malicious-packages workflow to no longer perform any manifest-related operations
- Perform a one-off, all-time sync of the manifests with the backend (picked up several hundred new items)
- Remove a sample, npm/compromised-libs/xrlp/4.3.0, that should not have been included in the dataset