CASM-4908 Runtime container image signature validation

Open mtupitsyn opened this issue 4 months ago • 0 comments

Summary and Scope

During initial testing of image signature validation, it was discovered that Kyverno tries to contact https://artifactory.algol60.net/ for image verification. This blocks deployments in air-gapped environments, even in Audit mode (CASMTRIAGE-7283). We need to configure Kyverno to contact the local registry instead, for both images and their respective signatures. This will allow us to turn on signature validation at runtime (during initial deployments, during upgrades, and in the background on running clusters).

The proposed solution involves these key steps:

  • Deploy a new Kyverno cluster policy, prepend-registry, which automatically prepends registry.local/ to the image spec of any new pod (unless it already starts with registry.local/).
  • Add a mirroring rule to the containerd configuration, so that images with names starting with registry.local/ are looked up in https://pit.nmn first and in https://registry.local/ second. This rule is needed to support the switch from PIT Nexus to Cloud Nexus during initial install. It is similar to the existing rule for image names starting with artifactory.algol60.net, which now becomes obsolete.
  • Move the Kyverno and policies deployment into a separate manifest, and deploy it early in the install/upgrade pipeline, thus ensuring that image name mutation and signature validation apply to all deployments that come after Kyverno.
  • For the duration of a fresh install, while images are downloaded from PIT Nexus, put a temporary hosts record override into the CoreDNS ConfigMap. This override points to PIT Nexus instead of Cloud Nexus. It is needed so that the Kyverno admission controller looks for images and their signatures in the right location during a fresh install (when Cloud Nexus is not yet deployed).
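
The first two steps can be modeled in a few lines of Python. This is only an illustrative sketch of the intended behavior, not the actual Kyverno policy or containerd configuration from the PRs below: the function names, the MIRRORS table, and the sample image path are hypothetical, and the mirror order is taken from the description above.

```python
# Illustrative model only: the real logic lives in a Kyverno ClusterPolicy
# and in the containerd configuration, not in Python.

REGISTRY_PREFIX = "registry.local/"

# Assumed lookup order for images under registry.local/, per the description:
# PIT Nexus first, then the local registry.
MIRRORS = {
    "registry.local": ["https://pit.nmn", "https://registry.local/"],
}


def prepend_registry(image):
    """Model of prepend-registry: add registry.local/ unless already present."""
    if image.startswith(REGISTRY_PREFIX):
        return image
    return REGISTRY_PREFIX + image


def mirror_endpoints(image):
    """Return the endpoints to try, in order, for an image reference.

    An empty list means this sketch defines no mirroring rule for the host.
    """
    host = image.split("/", 1)[0]
    return list(MIRRORS.get(host, []))


if __name__ == "__main__":
    # Hypothetical image reference, used only for illustration.
    original = "artifactory.algol60.net/csm-docker/stable/example:1.0.0"
    mutated = prepend_registry(original)
    print(mutated)
    print(mirror_endpoints(mutated))
```

Applying the mutation twice leaves the image name unchanged, which matches the "if it doesn't already start with registry.local/" condition in the policy description.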

This change consists of the following PRs:

  • https://github.com/Cray-HPE/csm/pull/3703
  • https://github.com/Cray-HPE/docs-csm/pull/5455
  • https://github.com/Cray-HPE/node-images/pull/1172
  • https://github.com/Cray-HPE/cray-kyverno-policies/pull/17

Issues and Related PRs

Testing

Tested on:

  • Virtual Shasta

Test description:

  • Created custom builds of CSM and docs-csm with the changes outlined above.
  • Performed multiple automated deployments on vShasta in different combinations: fresh install and upgrade, with validationFailureAction set to Audit and Enforce.

Risks and Mitigations

None known at this time.

Pull Request Checklist

  • [x] Version number(s) incremented, if applicable
  • [x] Copyrights updated
  • [x] License file intact
  • [x] Target branch correct
  • [x] Testing is appropriate and complete, if applicable
  • [x] HPC Product Announcement prepared, if applicable

mtupitsyn, Oct 10 '24 18:10