[FEATURE] ImageUpdateAutomation: compare image tag timestamp before override

Open leszczynskimikolaj opened this issue 2 years ago • 5 comments

Problem

I would like to do the image tag substitution if and only if the timestamp of the newly discovered image is greater than the one that is found in yaml spec. I'm aware that the ImagePolicies and ImageRepository can be configured to sort the tags in asc/desc numerical order, but in my case that is not enough and results in substitutions I don't want applied.

Question

Would it be possible to add extra verification logic at the place where we do the image substitution? I believe the actual setter logic is applied here.

leszczynskimikolaj avatar May 16 '22 16:05 leszczynskimikolaj

Is it possible for you to tag the images with the timestamp?

The main complicating factor of implementing your request I think would be that Flux doesn't really store the image metadata for any images now. We depend on the tag to be sortable, as described in Sortable Image Tags because pulling the metadata is expensive, and pulling the timestamp can be rate limited.

The last example here:

https://fluxcd.io/docs/components/image/imagepolicies/#examples

is based on the tagging strategy used here (RFC-3339): https://hub.docker.com/r/minio/minio

Does this help?

kingdonb avatar May 18 '22 12:05 kingdonb

@kingdonb I do indeed tag my images with the timestamp and use the sorting provided by ImagePolicies. The thing is that from time to time I also make a manual change on the marked line, using an image I build in my pipeline. Then, after a moment, this change gets overridden by the IAC with the image whose tag is indeed the latest one according to my filterPattern policy, but whose timestamp is smaller than the timestamp of the image I put in manually. Sorry if this seems confusing.

The main complicating factor of implementing your request I think would be that Flux doesn't really store the image metadata for any images now. We depend on the tag to be sortable, as described in Sortable Image Tags because pulling the metadata is expensive, and pulling the timestamp can be rate limited.

Hmm, I don't think we need to fetch metadata. We can just read the marked line, parse it to get the timestamp, compare that timestamp with the timestamp of the new value that came and only substitute if the new timestamp is greater than old one. Am I missing something?
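As a rough sketch of that idea (a Python illustration only, not the controller's actual Go setter code): the function names are hypothetical, the tag is assumed to end in a numeric timestamp, and the manifest line is assumed to look like `image: <repository>:<tag> # <marker>`.

```python
import re

# Assumption: the tag ends in "-<digits>", where the digits are the timestamp.
TIMESTAMP_RE = re.compile(r"-(\d+)$")


def tag_timestamp(tag):
    """Return the trailing numeric timestamp of a tag, or None if there is none."""
    match = TIMESTAMP_RE.search(tag)
    return int(match.group(1)) if match else None


def current_tag(marked_line):
    """Extract the tag from a line such as 'image: repo/name:tag # {...}'."""
    image_ref = marked_line.split("#", 1)[0].strip()  # drop the marker comment
    image_ref = image_ref.split(None, 1)[1]           # drop the leading 'image:' key
    return image_ref.rsplit(":", 1)[1]                # keep what follows the last ':'


def should_substitute(marked_line, new_tag):
    """Substitute only when the new tag's timestamp is strictly greater."""
    old_ts = tag_timestamp(current_tag(marked_line))
    new_ts = tag_timestamp(new_tag)
    if old_ts is None or new_ts is None:
        # Assumption: keep the existing behaviour when a timestamp cannot be parsed.
        return True
    return new_ts > old_ts
```

With a marked line whose tag ends in 124, `should_substitute(line, "service-main-sha-123")` would return False and the line would be left alone. The sketch does not handle an image reference that has no tag at all.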

leszczynskimikolaj avatar May 18 '22 13:05 leszczynskimikolaj

if and only if the timestamp of the newly discovered image is greater than the one that is found in yaml spec

Very sorry, I believe I misinterpreted your description here. Flux doesn't consider the image definition or the tag that is present in the cluster or in the git repository prior to reconciling the ImageUpdateAutomation; it simply asserts the current latest version into the manifest.

If you are saying that you had a newer image in the cluster but Flux overwrites it with an older one, there are a few ways that I can think of this might happen, but in general it shouldn't be possible unless you are doing something odd.

The only way I can see that happening is if you are manually writing commits that IUA would have written sooner or later, but you wrote them ahead of it because ImageUpdateAutomation is too slow. There are other mitigations for the slowness; for example, you can shorten the interval on ImageRepository (so that it scans more often) without much performance impact.

I'm trying to get to the bottom of exactly what's happening here because this seems like it may be an "X-Y" problem, as in "I'm asking for X because I don't know that Flux can do Y". To go with that information, you can also use a Receiver webhook through the notification-controller so that ImageRepository updates are processed immediately, if your image registry provider supports webhook delivery (most of which should be supported by Flux!)

Could you provide some clarity about when Flux is overwriting your newer image tag commits with older-timestamped image tags? In general I think this should not be possible, except for some weird edge cases. I'd rather not go through my list of guesses, so I'll just ask for more details instead: if you can clarify exactly when this is happening, I can help find the best approach to mitigate it.

Taking the prior tag content into consideration would be a major design change that would need to go through an RFC process and is unlikely to be accepted without strong justification. I don't yet understand how this issue bit you well enough to recommend such a change myself, as it seems it would involve significantly more complicated parsing and comparing than just asserting the latest tag into the field like Flux does now.

kingdonb avatar May 19 '22 20:05 kingdonb

@kingdonb thank you so much for your answer. Let me give you a more detailed explanation of what is, more or less, my case.

  1. I have the image policy that is looking for all images with my naming schema of <service name>-<branch name>-<commit sha>-<timestamp>.
  2. IUA controller works great.
  3. Let's assume the latest tag found (and then substituted into my yaml) by IUA is service-main-sha-123.
  4. I use the timestamp extract in asc numerical order.
  5. Now I build a new image called e.g. manual-service-main-sha-124 and make a commit manually to the file.
  6. Then IUA comes along after a moment and overrides this image with service-main-sha-123, which is an older image but is the one that satisfies the ImagePolicy.
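
One possible reading of these steps can be simulated in a few lines of Python. This is only an illustration of the selection behaviour, not the image-reflector-controller's code: the anchored filter pattern, the extra tag ending in 122, and the function name are all assumptions, and the real filterPattern was not posted in this thread.

```python
import re

# Assumed filter pattern: anchored at the start, so a tag with a "manual-"
# prefix does not match. The named group "ts" is the extracted timestamp.
FILTER_PATTERN = re.compile(r"^service-main-[a-z0-9]+-(?P<ts>[0-9]+)$")


def latest_by_policy(tags):
    """Pick the matching tag with the highest extracted number (numerical, asc)."""
    candidates = []
    for tag in tags:
        match = FILTER_PATTERN.match(tag)
        if match:
            candidates.append((int(match.group("ts")), tag))
    if not candidates:
        return None
    return max(candidates)[1]


tags = [
    "service-main-sha-122",
    "service-main-sha-123",
    "manual-service-main-sha-124",  # built manually, filtered out by the pattern
]
print(latest_by_policy(tags))  # prints "service-main-sha-123"
```

Under that assumption the manually committed tag is never a candidate, so the policy keeps reporting service-main-sha-123 as the latest image and the automation writes it back.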

That is why I thought about comparing timestamps. Is this now clear @kingdonb ?

leszczynskimikolaj avatar May 19 '22 20:05 leszczynskimikolaj

@leszczynskimikolaj Yes perfectly clear.

I would suggest that, rather than writing a commit manually ahead of IUA, you set up a Webhook Receiver for your ImageRepository. This will ensure it reconciles immediately after a tag is pushed to the Image Repo. It is as I suspected – the ImageUpdateAutomation interval arrives before the ImageRepository interval does, and when it reconciles, it only reconciles itself – nothing reconciles "through", so the ImageRepo at this point may be stale, which is the edge case you've described.

I'm not certain if this behavior will be considered a bug or UX issue, or working as designed, but I have some docs to support this suggestion. From the Image Update Guide:

https://fluxcd.io/docs/guides/image-update/#trigger-image-updates-with-webhooks

You may want to trigger a deployment as soon as a new image tag is pushed to your container registry. In order to notify the image-reflector-controller about new images, you can setup webhook receivers.

Another alternative to making the commit and pushing it (in case there's some reason in your environment that you cannot add a webhook, for instance because no access to the cluster is allowed from outside) is to reconcile the Image Repository. You should get the same commit that you are looking for by running flux reconcile image repository <name>

Internally, Flux resources that connect to each other all communicate via Kubernetes API object subscription. If a GitRepository is referenced by a Kustomization in the spec.sourceRef then Kustomization will be notified and reconciled immediately as soon as the GitRepository is updated and a new artifact is ready to fetch from its service endpoint.

The same relationship exists between ImageUpdateAutomation and ImageRepository resources (via ImagePolicy) – when the ImageRepository receives its update, any ImagePolicy attached to it will automatically receive an event through that subscription, which triggers any ImageUpdateAutomation in its namespace to reconcile immediately.

Webhook Receivers are different from the internal behavior because the physical resources (git repository, image repository, etc.) are actually located outside of Kubernetes and do not fall within the K8S API's scope of control, so the subscription cannot be set up implicitly through the API. To get this subscription behavior, you have to set it up according to these guides.

If you didn't know about these features and now you do, will this perhaps make your life easier if you don't have to worry about pushing commits manually in any case?

kingdonb avatar May 20 '22 15:05 kingdonb