Skaffold run is not idempotent anymore

Open balopat opened this issue 6 years ago • 18 comments

Since runId was introduced, k8s resources are recreated on every run. We should think carefully about how to enable idempotent behavior again.

balopat avatar Aug 28 '19 17:08 balopat

Idempotency is crucial if skaffold run is used in a git ops environment or if causing downtime is a no-go.

It's also important that running skaffold dev doesn't take down what's already running, because that costs time and can make the user think that something is broken in their code.

dgageot avatar Aug 28 '19 17:08 dgageot

I'm quite surprised to see this issue as "closed".

skaffold run is not idempotent; there is just a workaround that makes it possible with a flag that must be set carefully by the user.

Right now this behavior is dropping all my StatefulSets on each run... not really a good default behavior...

guilhem avatar Oct 23 '19 12:10 guilhem

Hello!

The issue with the Skaffold labels is still present. It breaks the semantics of kubectl apply and is a showstopper for using Skaffold for deploying. It makes no sense that the metadata of a tool used in the deployment process triggers a restart of the pods.

Maybe annotations instead of labels would be the proper place to put this kind of information.
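
To illustrate why a per-run label restarts pods, here is a minimal sketch. It assumes the skaffold.dev/run-id label ends up in the workload's pod template; Kubernetes rolls out a Deployment whenever .spec.template changes, labels included. The helpers template_hash and with_run_id are hypothetical stand-ins for illustration, not Skaffold or Kubernetes code.

```python
import copy
import hashlib
import json
import uuid


def template_hash(pod_template: dict) -> str:
    """Stable digest of a pod template (stand-in for change detection)."""
    canonical = json.dumps(pod_template, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:10]


def with_run_id(pod_template: dict, run_id: str) -> dict:
    """Return a copy of the template carrying a skaffold.dev/run-id label."""
    labelled = copy.deepcopy(pod_template)
    labels = labelled.setdefault("metadata", {}).setdefault("labels", {})
    labels["skaffold.dev/run-id"] = run_id
    return labelled


# Hypothetical pod template; the image and app names are made up.
base = {
    "metadata": {"labels": {"app": "web"}},
    "spec": {"containers": [{"name": "web", "image": "example/web:v1"}]},
}

# Two runs with fresh IDs: nothing else changed, yet the templates differ.
first = with_run_id(base, str(uuid.uuid4()))
second = with_run_id(base, str(uuid.uuid4()))
changed = template_hash(first) != template_hash(second)

# Two runs with a fixed ID: the templates are identical, so no rollout.
fixed_same = template_hash(with_run_id(base, "static")) == template_hash(
    with_run_id(base, "static")
)

print("random run-id changes template:", changed)
print("fixed run-id keeps template:", fixed_same)
```

In this model the only difference between the two runs is the label value, which is enough to make the template look new.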

Nuke1234 avatar Nov 26 '19 13:11 Nuke1234

Reopening issue.

tejal29 avatar Jun 19 '20 21:06 tejal29

@nkubala still on track for 1.16.0?

briandealwis avatar Oct 09 '20 02:10 briandealwis

mmmm i'll try and get my PR back open this week, otherwise i'll bump the milestone.

nkubala avatar Oct 14 '20 21:10 nkubala

Keeping the priority the same, though it's making triage unhappy.

tejal29 avatar Oct 28 '20 18:10 tejal29

Same issue here, with the runId causing problems with Jobs and causing resources to be re-created when they shouldn't be.

I haven't followed the whole story, but I'm curious to understand what the purpose of the runId is in the first place. It was added to solve some problem, but it's causing many others. The global override of the runId works, but then why even have a runId in the first place if it doesn't change?

When using dev, debug, or run, what changes is either the images defined within the skaffold.yaml (triggering an update of the tag) or the k8s manifests themselves, so why do we even need a runId? If we do need it, then why can't it be applied conditionally? A skip option that lists resources that need to be left untouched sounds like it would be a pretty straightforward solution...

streamnsight avatar Feb 03 '21 04:02 streamnsight

Curious if there is a workaround to this issue other than setting a fixed skaffold.dev/run-id label?

SKAFFOLD_LABEL: skaffold.dev/run-id=PREVENT_REDEPLOY

This method prevents redeployment of unchanged deployment resources (which is good), but it seems to interfere with Skaffold's stabilization mechanism (issue mentioned in #6758), where Skaffold can't distinguish new pods from the terminating ones.
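
A rough sketch of the trade-off, assuming pods are picked out by their skaffold.dev/run-id label. This is a simplified model for illustration, not Skaffold's actual status-check code; select_by_run_id and the pod names are hypothetical.

```python
def select_by_run_id(pods: list, run_id: str) -> list:
    """Names of pods whose skaffold.dev/run-id label equals run_id."""
    return [
        pod["name"]
        for pod in pods
        if pod["labels"].get("skaffold.dev/run-id") == run_id
    ]


# Default behavior: each run gets its own ID, so old and new pods differ.
unique_ids = [
    {"name": "web-old", "labels": {"skaffold.dev/run-id": "run-1"}, "terminating": True},
    {"name": "web-new", "labels": {"skaffold.dev/run-id": "run-2"}, "terminating": False},
]

# Fixed ID: pods from the previous and the current rollout share the label.
fixed_id = [
    {"name": "web-old", "labels": {"skaffold.dev/run-id": "PREVENT_REDEPLOY"}, "terminating": True},
    {"name": "web-new", "labels": {"skaffold.dev/run-id": "PREVENT_REDEPLOY"}, "terminating": False},
]

print(select_by_run_id(unique_ids, "run-2"))
print(select_by_run_id(fixed_id, "PREVENT_REDEPLOY"))
```

With unique IDs the selection returns only the new pod; with the fixed ID it also returns the terminating one, which matches the symptom described above.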

ynouri avatar Dec 06 '21 15:12 ynouri

@ynouri can you please open a separate issue? Please make sure you are running the latest version of Skaffold as there was an issue where we were treating pods owned by a deployment as equivalent to standalone pods that was fixed by #6697 in Skaffold v1.33.0.

briandealwis avatar Dec 07 '21 19:12 briandealwis

I've tested Skaffold version 2.0 with Google Cloud build; the label skaffold.dev/run-id=xxx still appears in the labels section rather than in annotations. Here is what I got in the rendered deployment manifest:

  creationTimestamp: "2022-12-26T04:48:41Z"
  generation: 35
  labels:
    app.kubernetes.io/managed-by: google-cloud-deploy
    deploy.cloud.google.com/delivery-pipeline-id: ocr-pipeline-dev
    deploy.cloud.google.com/location: asia-east2
    deploy.cloud.google.com/project-id: prj-seab-svc-anthos-cicd-dev
    deploy.cloud.google.com/release-id: coreocr-20230223-0858
    deploy.cloud.google.com/target-id: ocr-target-dev
    skaffold.dev/run-id: e2b8c346-fcc9-4cc3-9473-7b424f77eca6
  name: seaops-backend-celery-beat
  namespace: coreocr
  resourceVersion: "65778114"

This makes all resources restart even though they don't need to.

nguyen-viet-hung avatar Feb 23 '23 09:02 nguyen-viet-hung

Is there any progress on this? Skaffold sets a new skaffold.dev/run-id every time, both in the dependency of my Helm chart (https://charts.bitnami.com/bitnami/mongodb) and in my deployments, so every run restarts the database and pods and causes application errors. Thanks.

jearangoo avatar Oct 07 '23 00:10 jearangoo

@jearangoo As mentioned above, the work-around right now is to override the runId with some fixed value.

streamnsight avatar Oct 07 '23 16:10 streamnsight

Setting skaffold.dev/run-id: static doesn't help; Skaffold continues to recreate resources with a dynamic ID, skaffold.dev/run-id: 81c688cd-72ea-4cc9-b949-fd23fda86487

skaffold version  
v2.9.0

azhurbilo avatar Jan 18 '24 10:01 azhurbilo

Setting the env var like this works for me to keep Kubernetes from creating a new ReplicaSet: SKAFFOLD_LABEL=skaffold.dev/run-id=static. But this leads to other problems. For example, if you do a run with a crashing workload, the next skaffold run will not complete because it waits for the old workload to reach a good status. Can we get some attention on this?

joreetz-otto avatar Jun 26 '24 11:06 joreetz-otto