
Not coming back from scale-to-zero (KEDA) with latest image

Opened by black-snow • 2 comments

Describe the bug

I'm not quite sure this is really a bug; it's perhaps unexpected, but obvious if you know how image-updater works. I'm reporting it anyway because it surprised me and went against my naive expectation of what should happen.

I have some deployments that scale to zero. I run KEDA with some trigger, so there's always a Deployment and a ReplicaSet, but a pod only gets spawned when there's actual work to do.
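For context, the scale-to-zero piece looks roughly like this. A minimal sketch, assuming a hypothetical `worker` Deployment and a cron trigger; the concrete trigger type doesn't matter for this issue:

```yaml
# Hypothetical KEDA ScaledObject: the Deployment and ReplicaSet always exist,
# but replicas stay at 0 while the trigger is inactive.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker            # hypothetical Deployment name
  minReplicaCount: 0        # scale to zero when idle
  maxReplicaCount: 1
  triggers:
    - type: cron            # any trigger works; cron just keeps the sketch small
      metadata:
        timezone: Etc/UTC
        start: "0 8 * * *"
        end: "0 18 * * *"
        desiredReplicas: "1"
```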

What I noticed is that when I push a new image version to the registry and a pod gets spawned, it is scheduled with the digest it was last scheduled with, i.e. the old one. It then runs for a moment, gets killed, and is replaced with a pod running the new image version.

I have set:

```yaml
argocd-image-updater.argoproj.io/backend.update-strategy: digest
argocd-image-updater.argoproj.io/write-back-method: git
```
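For completeness, those annotations sit on the Argo CD Application next to an image-list entry that defines the `backend` alias. A sketch with hypothetical registry and repository names:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: backend
  annotations:
    # hypothetical image; "backend" is the alias referenced by the annotations below
    argocd-image-updater.argoproj.io/image-list: backend=registry.example.com/acme/backend
    argocd-image-updater.argoproj.io/backend.update-strategy: digest
    argocd-image-updater.argoproj.io/write-back-method: git
```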

I think I understand why this happens, and I do see the write-backs land in my Helm repo, but I wish the behaviour were different. With always-on workloads the delay between a registry push and the image update in k8s is quite short, but in this scenario it can be quite surprising: k8s might schedule a pod that is way outdated. The best case is probably that it simply crashes because it's too old; the worst case is that it does wrong and unexpected things until it gets killed :/
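For reference, with the git write-back method the updater commits an override file into the repo, roughly like the sketch below, assuming the default `.argocd-source-<appName>.yaml` write-back target and default Helm parameter names (the exact parameter names and digest format depend on the application's annotations and may differ):

```yaml
# .argocd-source-backend.yaml (sketch) -- committed by argocd-image-updater
helm:
  parameters:
    - name: image.name
      value: registry.example.com/acme/backend   # hypothetical repository
    - name: image.tag
      value: sha256:0123abcd...                  # digest written by the digest strategy (format may differ)
```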

To Reproduce

  • have a workload that scales to zero replicas (it doesn't have to be KEDA; you can also scale it down manually)
  • set update strategy and write-back
  • push a new image version
  • get replicas to >0 (via the trigger or manually)
  • watch the scheduled pod: the digest is not the newest one

Expected behavior

I'd want image-updater not to lag behind for scaled-down deployments. It would be nice if it looked at the Deployment or ReplicaSet, since those are always there.

Additional context

Does imagePullPolicy: Always "fix" this issue? Probably. Is there currently a better way to achieve this?
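A hedged note on the imagePullPolicy question, in sketch form: Always only re-resolves mutable tags at pod start; if the rendered manifest already pins a digest (which is what the digest strategy is meant to produce in git), the kubelet pulls exactly that digest regardless of the policy:

```yaml
# Pod template fragment (sketch) -- hypothetical names
spec:
  containers:
    - name: backend
      # With a mutable tag, imagePullPolicy: Always re-resolves the tag on every container start.
      # With a pinned name@sha256:... reference, Always still pulls exactly that digest,
      # so it does not pick up a newer push by itself.
      image: registry.example.com/acme/backend:latest
      imagePullPolicy: Always
```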

Version 0.11.0

Logs N/A

black-snow • Oct 01 '24 21:10