che icon indicating copy to clipboard operation
che copied to clipboard

Use digests instead of tags to reference images

Open l0rd opened this issue 5 years ago • 18 comments

Is your enhancement related to a problem?

Images tags can reference an image today and a different one tomorrow. That makes our plugin and devfile registries builds not repeatable. We have the same kind of problem with the references to git repos. In this case it's for containers.

Using the digests would have the following benefits:

  • make the build of the registries repeatable (with that as well)
  • reducing the number of builds and push of images (no changes --> no build/push)
  • reduce the number of replacements we need to do in devfiles and meta.yaml when preparing a release

Describe the solution you'd like

Replacing tags with digests in:

  • [x] che dockerfiles
  • [x] plugin registry
  • [x] devfile registry
  • [ ] properties files

l0rd avatar Feb 14 '20 19:02 l0rd

@l0rd could you please put the priority and team for this task. I suspect this is P1 since this feature is needed for the product

ibuziuk avatar Feb 17 '20 08:02 ibuziuk

@ibuziuk done. By the way this is not needed for the product. It's an improvement that will make Che more reliable. I have also affected the label kind/epic because it will need multiple teams to work on and probably a few iterations.

l0rd avatar Feb 17 '20 08:02 l0rd

properties files

What properties files do we have in Che that refer to images?

che dockerfiles

Can you give an example of this?

In CRW we only use tags converted to digests in three places:

  • plugin registry
  • devfile registry
  • operator metadata's latest CSV file

These are updated at build time from the latest tag via these scripts:

  • https://github.com/redhat-developer/codeready-workspaces/blob/master/dependencies/che-plugin-registry/build/scripts/write_image_digests.sh
  • https://github.com/redhat-developer/codeready-workspaces/blob/master/dependencies/che-devfile-registry/build/scripts/write_image_digests.sh
  • https://github.com/redhat-developer/codeready-workspaces-operator/blob/master/build/scripts/addDigests.sh and https://github.com/redhat-developer/codeready-workspaces-operator/blob/master/build/scripts/buildDigestMap.sh

Perhaps the best solution here is to implement the same thing in upstream, so every container build of registries and every csv release uses specific digests, but the code in GH still contains the tag references?

That way you have BOTH the snapshot with the specific, permanent reference to a specific image, but also the code with the moving tag in case you want to rebuild it against a newer nightly/latest.

nickboldt avatar Apr 28 '20 13:04 nickboldt

A user is asking about image upgrades. (i.e. some workspace image gets a CVE upgrade.) To make use of the improved image:

A) The user should upgrade the entire product, via che(crw)ctl server:upgrade B) The user should manually hack plugin, devfile, operator metadata C) We will provide some documented way of doing this D) Other

(Note: Docker build processes above don't seem workable on RHEL 7)

@nickboldt @l0rd

RickJWagner avatar May 06 '20 15:05 RickJWagner

@nickboldt 4 images are specified in che.properties, every Dockerfile in our repos have a FROM <base-image> where the <base-image> should be specified using a digest.

Perhaps the best solution here is to implement the same thing in upstream, so every container build of registries and every csv release uses specific digests, but the code in GH still contains the tag references?

That approach doesn't solve the main problem here: we want repeatable builds when we do a release. When our source code doesn't change we want the result of the build to be identical. The example of issue that we want to fix is when che.openshift.io got broken after an update because we were referring quarkus nightly image as nightly and that image had "silently" changed.

That way you have BOTH the snapshot with the specific, permanent reference to a specific image, but also the code with the moving tag in case you want to rebuild it against a newer nightly/latest.

That's exactly what we want to avoid.

l0rd avatar May 06 '20 18:05 l0rd

From my understanding, we should convert the usage of tags to digests in:

  • the registry (devfile, plugin) meta.yaml's
  • Dockerfiles base images

Going forward then we should only accept contributions with digests used, no tags allowed. There may even be a PR check that could be triggered which will use a regex to automatically validate that no tags have been used in the PR being submitted.

ericwill avatar May 06 '20 18:05 ericwill

@ericwill that's fine. What about the use of next version of theia? Are we going to replace next here with a digest or should we consider it as an exception?

l0rd avatar May 06 '20 21:05 l0rd

@RickJWagner it depends on the image. If it's an update of a server image (operator, che server, registries etc...) those are updated without any manual intervention. If it's an update of a workspace image (theia, machine-exec, sidecars etc...) then existing workspaces that include that image need to be manually migrated. Existing workspaces only. New workspaces will automatically include the new patched image.

l0rd avatar May 06 '20 21:05 l0rd

@RickJWagner but in any case we need to release a new version of the operator first. Otherwise no update will be triggered.

l0rd avatar May 06 '20 21:05 l0rd

@ericwill that's fine. What about the use of next version of theia? Are we going to replace next here with a digest or should we consider it as an exception?

I think it makes sense to handle this as an exception, since we are the ones controlling the release of che-theia, and the releases of the registries are sync'ed with the releases of che-theia.

ericwill avatar May 07 '20 18:05 ericwill

Another option here, somewhat in line with Nick's suggestion, is to fix images via digest for releases and leave tags in the master branch. Once we release e.g. 7.15.0 for the registries, we run the write_image_digests.sh script and that release will be a reproducible build, without the manual work of maintaining 'freshness' of the digests.

amisevsk avatar May 12 '20 21:05 amisevsk

Maybe I'm missing the point here, but wouldn't that still leave us with the problem of images "silently" changing between releases? For example: if a change is pushed to a :nightly image over the course of a sprint, then automatically updating that image at release time means a potentially breaking change is stuffed into the next release.

IMO we should heavily automate testing of these images, and have dependabot style automation for keeping them up-to-date. That way we aren't blindly updating images in a script, yet at the same time reducing our time spent maintaining these images. The added benefit here is that there is a clear trail of issues/PRs showing what was changed, so in case something breaks it's easier to find the breaking commit.

The flow would be:

  1. Dependabot checks images for updates, opens issues/PRs for updates.
  2. Strong automated testing of the image is triggered via CI.
  3. A quick manual verification/smoke test is conducted before merging.

ericwill avatar May 13 '20 11:05 ericwill

Part of the issue here is that if we can implement

Strong automated testing of the image is triggered via CI.

that's all the issue boils down to. If we intend automated jobs to catch breaking changes, then it doesn't really matter all that much, does it?

  • A) Keep nightly tag, use automated tests to test images. If a breaking change is pushed, CI fails and we do the work to resolve the issue.
  • B) Use digests, CI triggers PRs on update that are tested via automated tests. If tests fail, we do the work to resolve the issue.

The same image is published at release time in both cases, except that case b) will require manual verification of every change and be a significant manual burden until automated tests are in a stable place.

I agree that FROM images should be fixed by digest, I'm referring more to fixing digests for every image in the registries themselves.

amisevsk avatar May 13 '20 21:05 amisevsk

Okay I think we're talking about the same thing. My main concern are the 14 base images in the devfile registry. This is where "silent changes" can really cause problems for us.

The other images in the plugin registry aren't a high priority right now, because most of them are already using "digest style" tags in the format <IMAGE_NAME>:<VERSION>-<COMMIT ID>, so it's impossible for silent changes to happen as those images are only ever pushed once.

What I propose as an immediate action:

  1. Convert the base images in the devfile registry to digests
  2. Introduce dependabot automation for said images, so we get free PR's and can test each change easily. There aren't many images here anyway so it should be manageable.
  3. Improve automated tests for the plugin/devfile registries (already in progress)
  4. Compile a list of plugins not using the "digest style" tag format and convert them.

We could probably (just a thought) handle the plugin registry via attrition actually, and just do digests going forward as plugins become updated (those have to be done manually anyway, for now).

ericwill avatar May 14 '20 23:05 ericwill

That makes sense :+1:, sorry I misunderstood the context of the discussion.

amisevsk avatar May 15 '20 22:05 amisevsk

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

che-bot avatar Feb 04 '21 14:02 che-bot

Plugin registry is done.

ericwill avatar Mar 17 '21 19:03 ericwill

Devfile registry is also done.

ericwill avatar Jun 18 '21 20:06 ericwill