Skaffold waits indefinitely for pod to stabilize although it is already running
Expected behavior
Skaffold's status check succeeds once the Pod is deployed and running
Actual behavior
Skaffold keeps waiting for the Pod to stabilize until the 10-minute timeout is reached, and then the deployment is canceled
Information
- Skaffold version: 1.39.1
- Operating system: macOS 12.5.1
- Installed via: Homebrew
- Contents of skaffold.yaml:
apiVersion: skaffold/v2beta24
kind: Config
deploy:
  kustomize:
    paths:
    - .
  kubeContext: docker-desktop
- Contents of kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
- Contents of deployment.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: toolbox
spec:
  containers:
  - name: toolbox
    image: alpine:latest
    imagePullPolicy: IfNotPresent
    command: [ "tail", "-f", "/dev/null" ] # keep the container running in the background without doing anything
  terminationGracePeriodSeconds: 0 # terminate instantly; tail does not exit, so it would otherwise wait 30 seconds for termination
Steps to reproduce the behavior
- clone https://github.com/DerGary/skaffold-podbug-example
- run skaffold dev
Logs
- Skaffold without debug:
Listing files to watch...
Generating tags...
Checking cache...
Tags used in deployment:
Starting deploy...
- pod/toolbox created
Waiting for deployments to stabilize...
- pods: could not stabilize within 10m0s
- pods failed. Error: could not stabilize within 10m0s.
Cleaning up...
- pod "toolbox" deleted
1/1 deployment(s) failed
- kubectl describe pod toolbox:
Name: toolbox
Namespace: default
Priority: 0
Node: docker-desktop/192.168.65.4
Start Time: Wed, 31 Aug 2022 09:53:37 +0200
Labels: skaffold.dev/run-id=c55e0baf-ebb7-4b2c-aa6e-ddb580c0207f
Annotations: <none>
Status: Running
IP: 10.1.16.235
IPs:
IP: 10.1.16.235
Containers:
toolbox:
Container ID: docker://2d66b6508d6406f449c3e27d7c1b058100ab1e7bc15ccd993355a90bd59b875f
Image: alpine:latest
Image ID: docker-pullable://alpine@sha256:bc41182d7ef5ffc53a40b044e725193bc10142a1243f395ee852a8d9730fc2ad
Port: <none>
Host Port: <none>
Command:
tail
-f
/dev/null
State: Running
Started: Wed, 31 Aug 2022 09:53:38 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fbscl (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-fbscl:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 38s default-scheduler Successfully assigned default/toolbox to docker-desktop
Normal Pulled 38s kubelet Container image "alpine:latest" already present on machine
Normal Created 38s kubelet Created container toolbox
Normal Started 38s kubelet Started container toolbox
- skaffold dev -vdebug
DEBU[0000] skaffold API not starting as it's not requested subtask=-1 task=DevLoop
INFO[0000] Skaffold &{Version:v1.39.1 ConfigVersion:skaffold/v2beta29 GitVersion: GitCommit:cd3f6fa3231ae8abf7f028eb7163d74aafd6ae94 BuildDate:2022-06-25T00:11:50Z GoVersion:go1.17.11 Compiler:gc Platform:darwin/amd64 User:} subtask=-1 task=DevLoop
INFO[0000] Loaded Skaffold defaults from "/Users/macdev/.skaffold/config" subtask=-1 task=DevLoop
DEBU[0000] config version out of date: upgrading to latest "skaffold/v2beta29" subtask=-1 task=DevLoop
DEBU[0000] parsed 1 configs from configuration file /Users/macdev/Documents/git/skaffold-bug-example/skaffold.yaml subtask=-1 task=DevLoop
DEBU[0000] Defaulting build type to local build subtask=-1 task=DevLoop
INFO[0000] Using kubectl context: docker-desktop subtask=-1 task=DevLoop
DEBU[0000] getting client config for kubeContext: `docker-desktop` subtask=-1 task=DevLoop
DEBU[0000] Running command: [minikube version --output=json] subtask=-1 task=DevLoop
DEBU[0000] setting Docker user agent to skaffold-v1.39.1 subtask=-1 task=DevLoop
DEBU[0000] CLI platforms provided: "" subtask=-1 task=DevLoop
DEBU[0000] getting client config for kubeContext: `docker-desktop` subtask=-1 task=DevLoop
DEBU[0000] platforms detected from active kubernetes cluster nodes: "linux/amd64" subtask=-1 task=DevLoop
DEBU[0000] Using builder: local subtask=-1 task=DevLoop
DEBU[0000] push value not present in NewBuilder, defaulting to false because cluster.PushImages is false subtask=-1 task=DevLoop
INFO[0000] build concurrency first set to 1 parsed from *local.Builder[0] subtask=-1 task=DevLoop
INFO[0000] final build concurrency value is 1 subtask=-1 task=DevLoop
Listing files to watch...
DEBU[0000] Executing template &{envTemplate 0xc0000fb8c0 0xc000d8caa0 } with environment map[COLORTERM:truecolor COMMAND_MODE:unix2003 GIT_ASKPASS:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass.sh HOME:/Users/macdev LANG:en_GB.UTF-8 LESS:-R LOGNAME:macdev LSCOLORS:Gxfxcxdxbxegedabagacad MallocNanoZone:0 OLDPWD:/Users/macdev/Documents/git/skaffold-bug-example ORIGINAL_XDG_CURRENT_DESKTOP:undefined PAGER:less PATH:/Users/macdev/.rd/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/share/dotnet:~/.dotnet/tools:/Library/Apple/usr/bin:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/Users/macdev/.rd/bin PWD:/Users/macdev/Documents/git/skaffold-bug-example SHELL:/bin/zsh SHLVL:1 SSH_AUTH_SOCK:/private/tmp/com.apple.launchd.z0QW2TB2yK/Listeners TERM:xterm-256color TERM_PROGRAM:vscode TERM_PROGRAM_VERSION:1.70.2 TMPDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/ USER:macdev VSCODE_GIT_ASKPASS_EXTRA_ARGS:--ms-enable-electron-run-as-node VSCODE_GIT_ASKPASS_MAIN:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE:/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper.app/Contents/MacOS/Code Helper VSCODE_GIT_IPC_HANDLE:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-git-2bf68283ae.sock VSCODE_INJECTION:1 XPC_FLAGS:0x0 XPC_SERVICE_NAME:0 ZDOTDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-zsh ZSH:/Users/macdev/.oh-my-zsh _:/usr/local/bin/skaffold __CFBundleIdentifier:com.microsoft.VSCode __CF_USER_TEXT_ENCODING:0x1F6:0x0:0x0] subtask=-1 task=DevLoop
INFO[0000] List generated in 1.110424ms subtask=-1 task=DevLoop
Generating tags...
INFO[0000] Tags generated in 34.875µs subtask=-1 task=Build
Checking cache...
INFO[0000] Cache check completed in 168.349µs subtask=-1 task=Build
Tags used in deployment:
Starting deploy...
DEBU[0000] getting client config for kubeContext: `docker-desktop` subtask=-1 task=DevLoop
DEBU[0000] Running command: [kubectl version --client -ojson] subtask=0 task=Deploy
DEBU[0000] Command output: [{
"clientVersion": {
"major": "1",
"minor": "21",
"gitVersion": "v1.21.3",
"gitCommit": "ca643a4d1f7bfe34773c74f79527be4afd95bf39",
"gitTreeState": "clean",
"buildDate": "2021-07-15T21:04:39Z",
"goVersion": "go1.16.6",
"compiler": "gc",
"platform": "darwin/amd64"
}
}
] subtask=0 task=Deploy
DEBU[0000] Executing template &{envTemplate 0xc000a399e0 0xc00092f220 } with environment map[COLORTERM:truecolor COMMAND_MODE:unix2003 GIT_ASKPASS:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass.sh HOME:/Users/macdev LANG:en_GB.UTF-8 LESS:-R LOGNAME:macdev LSCOLORS:Gxfxcxdxbxegedabagacad MallocNanoZone:0 OLDPWD:/Users/macdev/Documents/git/skaffold-bug-example ORIGINAL_XDG_CURRENT_DESKTOP:undefined PAGER:less PATH:/Users/macdev/.rd/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/share/dotnet:~/.dotnet/tools:/Library/Apple/usr/bin:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/Users/macdev/.rd/bin PWD:/Users/macdev/Documents/git/skaffold-bug-example SHELL:/bin/zsh SHLVL:1 SSH_AUTH_SOCK:/private/tmp/com.apple.launchd.z0QW2TB2yK/Listeners TERM:xterm-256color TERM_PROGRAM:vscode TERM_PROGRAM_VERSION:1.70.2 TMPDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/ USER:macdev VSCODE_GIT_ASKPASS_EXTRA_ARGS:--ms-enable-electron-run-as-node VSCODE_GIT_ASKPASS_MAIN:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE:/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper.app/Contents/MacOS/Code Helper VSCODE_GIT_IPC_HANDLE:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-git-2bf68283ae.sock VSCODE_INJECTION:1 XPC_FLAGS:0x0 XPC_SERVICE_NAME:0 ZDOTDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-zsh ZSH:/Users/macdev/.oh-my-zsh _:/usr/local/bin/skaffold __CFBundleIdentifier:com.microsoft.VSCode __CF_USER_TEXT_ENCODING:0x1F6:0x0:0x0] subtask=-1 task=DevLoop
DEBU[0000] Running command: [kustomize build .] subtask=0 task=Deploy
DEBU[0000] Command output: [apiVersion: v1
kind: Pod
metadata:
name: toolbox
spec:
containers:
- command:
- tail
- -f
- /dev/null
image: alpine:latest
imagePullPolicy: IfNotPresent
name: toolbox
terminationGracePeriodSeconds: 0
] subtask=0 task=Deploy
DEBU[0000] manifests with tagged images:apiVersion: v1
kind: Pod
metadata:
name: toolbox
spec:
containers:
- command:
- tail
- -f
- /dev/null
image: alpine:latest
imagePullPolicy: IfNotPresent
name: toolbox
terminationGracePeriodSeconds: 0 subtask=0 task=Deploy
DEBU[0000] manifests with labels apiVersion: v1
kind: Pod
metadata:
labels:
skaffold.dev/run-id: c6eec86e-e726-4183-af12-8d4d4e23f536
name: toolbox
spec:
containers:
- command:
- tail
- -f
- /dev/null
image: alpine:latest
imagePullPolicy: IfNotPresent
name: toolbox
terminationGracePeriodSeconds: 0 subtask=-1 task=DevLoop
DEBU[0000] Running command: [kubectl --context docker-desktop get -f - --ignore-not-found -ojson] subtask=0 task=Deploy
DEBU[0000] Command output: [] subtask=0 task=Deploy
DEBU[0000] 1 manifests to deploy. 1 are updated or new subtask=0 task=Deploy
DEBU[0000] Running command: [kubectl --context docker-desktop apply -f -] subtask=0 task=Deploy
- pod/toolbox created
INFO[0000] Deploy completed in 454.900172ms subtask=-1 task=Deploy
Waiting for deployments to stabilize...
DEBU[0000] getting client config for kubeContext: `docker-desktop` subtask=-1 task=DevLoop
DEBU[0000] getting client config for kubeContext: `docker-desktop` subtask=-1 task=DevLoop
DEBU[0000] checking status pods subtask=-1 task=Deploy
DEBU[0601] marking resource failed due to error code STATUSCHECK_DEADLINE_EXCEEDED subtask=-1 task=Deploy
- pods: could not stabilize within 10m0s
- pods failed. Error: could not stabilize within 10m0s.
DEBU[0601] setting skaffold deploy status to STATUSCHECK_DEADLINE_EXCEEDED. subtask=-1 task=Deploy
Cleaning up...
DEBU[0601] Executing template &{envTemplate 0xc000bc70e0 0xc000d8f630 } with environment map[COLORTERM:truecolor COMMAND_MODE:unix2003 GIT_ASKPASS:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass.sh HOME:/Users/macdev LANG:en_GB.UTF-8 LESS:-R LOGNAME:macdev LSCOLORS:Gxfxcxdxbxegedabagacad MallocNanoZone:0 OLDPWD:/Users/macdev/Documents/git/skaffold-bug-example ORIGINAL_XDG_CURRENT_DESKTOP:undefined PAGER:less PATH:/Users/macdev/.rd/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/share/dotnet:~/.dotnet/tools:/Library/Apple/usr/bin:/Library/Frameworks/Mono.framework/Versions/Current/Commands:/Users/macdev/.rd/bin PWD:/Users/macdev/Documents/git/skaffold-bug-example SHELL:/bin/zsh SHLVL:1 SSH_AUTH_SOCK:/private/tmp/com.apple.launchd.z0QW2TB2yK/Listeners TERM:xterm-256color TERM_PROGRAM:vscode TERM_PROGRAM_VERSION:1.70.2 TMPDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/ USER:macdev VSCODE_GIT_ASKPASS_EXTRA_ARGS:--ms-enable-electron-run-as-node VSCODE_GIT_ASKPASS_MAIN:/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE:/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper.app/Contents/MacOS/Code Helper VSCODE_GIT_IPC_HANDLE:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-git-2bf68283ae.sock VSCODE_INJECTION:1 XPC_FLAGS:0x0 XPC_SERVICE_NAME:0 ZDOTDIR:/var/folders/gq/26phy7rj5kv88dchtpn9c_5w0000gp/T/vscode-zsh ZSH:/Users/macdev/.oh-my-zsh _:/usr/local/bin/skaffold __CFBundleIdentifier:com.microsoft.VSCode __CF_USER_TEXT_ENCODING:0x1F6:0x0:0x0] subtask=-1 task=DevLoop
DEBU[0601] Running command: [kustomize build .] subtask=-1 task=DevLoop
DEBU[0601] Command output: [apiVersion: v1
kind: Pod
metadata:
name: toolbox
spec:
containers:
- command:
- tail
- -f
- /dev/null
image: alpine:latest
imagePullPolicy: IfNotPresent
name: toolbox
terminationGracePeriodSeconds: 0
] subtask=-1 task=DevLoop
DEBU[0601] Running command: [kubectl --context docker-desktop delete --ignore-not-found=true --wait=false -f -] subtask=-1 task=DevLoop
- pod "toolbox" deleted
INFO[0603] Cleanup completed in 2.164 seconds subtask=-1 task=DevLoop
DEBU[0603] Running command: [tput colors] subtask=-1 task=DevLoop
DEBU[0603] Command output: [256
] subtask=-1 task=DevLoop
1/1 deployment(s) failed
DEBU[0603] exporting metrics subtask=-1 task=DevLoop
I deleted kubeContext: docker-desktop from skaffold.yaml as I don't have that k8s context. I cannot reproduce the issue after that.
I can reproduce it with the local docker-desktop Kubernetes and with the rancher-desktop Kubernetes. @ericzzzzzzz which Kubernetes cluster are you using? When I try a hosted context in Azure Kubernetes Service (AKS), it works as expected.
@DerGary my bad. I assumed you were using minikube as the Kubernetes cluster for local development. With minikube, your test project works on my machine; however, I can now reproduce the issue with the docker-desktop Kubernetes cluster on my machine with 1.39.1. I'll look into why this is happening.
The Example/getting-started project works with the docker-desktop cluster. The test project also works with this cluster when using a binary built from the main branch. This scenario seems to be a special case, so I'm marking this issue as p2 for now.
It's interesting that the Example/getting-started project works. I looked deeper into that and found the difference that makes it work or not work. With my example, I can also get it to work when I change kustomize to kubectl in the skaffold.yaml like this:
skaffold.yaml
apiVersion: skaffold/v2beta24
kind: Config
deploy:
  kustomize:
    paths:
    - .
to:
apiVersion: skaffold/v2beta24
kind: Config
deploy:
  kubectl:
    manifests:
    - deployment.yaml
That is not really a solution for us at the moment, but maybe it helps to find the root cause?
I think I'm experiencing this same bug. I have two Skaffold projects, one that uses deploy.kubectl and one that uses deploy.kustomize. I'm using Rancher Desktop and also skaffold dev. The project using deploy.kubectl deploys and stabilizes fine, and the project using deploy.kustomize fails to stabilize, erroring with "[image] can't be pulled".
My hypothesis is that the kustomize deployer doesn't correctly inject the local docker image (docker://...). I may have more time to investigate later.
@DerGary @kevin-hanselman
The kustomize deployer reads the k8s config to get the default namespace when building the status check monitor. Can you try running kubectl config set-context --current --namespace=default to set the current namespace to default, then run your test case to see if it works?
@ericzzzzzzz I tried this, and it doesn't help. I don't think the issue is with the status monitor's namespace. I think k8s is trying to pull the image(s) from a remote registry when it shouldn't be; the image should be taken from the local Docker instance.
In the kubectl deploy case (i.e. the working case), in the server-side Pod YAML, under status.containerStatuses, imageID is set to docker://... .
In the kustomize deploy case (i.e. the broken case), in the server-side Pod YAML, under status.containerStatuses, imageID is empty.
To reiterate and clarify: I am running Rancher Desktop, and I have configured Skaffold to recognize it as a local cluster:
$ skaffold config list
skaffold config:
kube-context: rancher-desktop
local-cluster: true
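For reference, the imageID comparison described above can be made with a kubectl jsonpath query. This is a minimal sketch, assuming the Pod name is passed as the first argument; the pod_image_ids helper name is illustrative and not part of Skaffold or kubectl.

```shell
#!/bin/sh
# Hypothetical helper: print the imageID of every container in a Pod,
# as reported by the API server under status.containerStatuses.
# $1: Pod name (in the namespace of the current kubectl context).
pod_image_ids() {
  kubectl get pod "$1" -o jsonpath='{.status.containerStatuses[*].imageID}'
}

# Example (requires a reachable cluster):
#   pod_image_ids toolbox
```

An empty result for a Pod that should be using a local image corresponds to the broken case described above.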
Hey @kevin-hanselman, the issue you're experiencing is different from @DerGary's. In his case:
- The container is already running; Skaffold just cannot get the status from his cluster.
In your case:
- The image cannot be pulled from the registry when deploying to the rancher-desktop cluster with the kustomize deployer.
If this is true, please open another issue. It would be great if you could provide more details so we can reproduce it. Thanks.
@ericzzzzzzz Thanks for clarifying. I began working on a minimal reproducible example, and I found the source of the issue I'm experiencing. I had my Pod's container configured with imagePullPolicy: Always, so k8s will always try to pull it from a registry. This is my mistake. Sorry for adding noise to this issue.
@ericzzzzzzz
kubectl config set-context --current --namespace=default
I don't really grasp what this does, because my namespace was default all along, but after executing it I can't reproduce the issue anymore. (I have also upgraded Skaffold to 1.39.2 since then, which shouldn't make a difference, I think?)
@DerGary
You can reproduce the issue again by resetting your docker-desktop cluster (I wouldn't recommend doing this, though) or by running kubectl config set-context --current --namespace='' to set your current namespace preference to an empty string. Then validate the namespace setup in your context config with kubectl config view --minify | grep namespace: which should return no result this time. Now run your test project; the issue is still there, even with Skaffold 1.39.2.
After testing this, you can reset the namespace to default with kubectl config set-context --current --namespace=default, then run kubectl config view --minify | grep namespace: and namespace: default should be in the output. Now run skaffold dev and everything should be good.
The problem is that docker-desktop doesn't have a namespace preference in its config, so the kustomize deployer uses an empty string as the namespace when doing the status check for pods. Your apps, however, are deployed to the default namespace, so the status monitor keeps getting nothing from the docker-desktop cluster and retrying until it times out. That's why manually setting the namespace fixes the issue.
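To check whether a context is in the state described above, the namespace preference can be read directly from the kubeconfig. The following is a minimal sketch; the function names context_namespace and describe_namespace are illustrative, not part of kubectl or Skaffold.

```shell
#!/bin/sh
# Print the namespace preference of the current kubectl context.
# The output is empty when the context has no namespace field.
context_namespace() {
  kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null
}

# Turn the (possibly empty) namespace string into a readable message.
# $1: namespace string as returned by context_namespace
describe_namespace() {
  if [ -z "$1" ]; then
    echo "no namespace preference set"
  else
    echo "namespace preference: $1"
  fi
}

# Example:
#   describe_namespace "$(context_namespace)"
```

If this reports that no preference is set, the context matches the situation described above, in which the status check times out with the kustomize deployer.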
Skaffold doesn't have this problem on the main branch: when upgrading the schema, the main branch replaces the kustomize deployer with the kubectl deployer.
I found similar behavior when I manually waited for a rollout with the following commands.
skaffold run --status-check=false
kubectl rollout status deployment/<app>
However, the following commands don't have the problem.
# Deploy but don't wait for rollout first.
skaffold run --status-check=false
# Then deploy again.
skaffold run
On the second run, I expected deployment.apps/<app> unchanged, but it prints deployment.apps/<app> configured