skaffold dev should not fail on first failed build / deploy
I propose that starting skaffold dev should start the dev loop, even if the first build or deployment fails.
I see no additional value in failing on the first build. In the end, what will the developer do to improve the situation and "finally start 'dev'-ing"? They will fix the errors and rerun skaffold dev. Why can't they do that while it's still running and get feedback faster?
I argue that we can remove this artificial need for an extra skaffold dev command. If the user wants to stop skaffold dev-ing as they realize that it will take longer for them to fix things up, they can always Ctrl+C out of skaffold dev.
I intend to work on this this week for the fixit effort - please comment if you feel strongly against it.
More context on this: #516. Also, the failure might be due to infrastructure issues, not just application-related ones: for example, Docker is not installed, or the user is pointed at the wrong kubecontext. That was the original motivation: we used to only warn on build errors without stopping, which created all sorts of messes, so today we do stop at failed builds.
As I'm thinking about this, maybe it is a good idea to fail the first iteration on infrastructure errors. This would require us to differentiate more carefully between error types, namely infra vs. application errors, which is actually something @tejal29 has started on already.
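To make the infra vs. application split concrete, here is a minimal sketch of how a first iteration could decide whether to abort. Everything in it (ErrorKind, Infra, KindOf, ShouldAbortFirstIteration, the sample errors) is a hypothetical name made up for illustration; it is not skaffold's actual error handling.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorKind is a hypothetical classification, not a skaffold type.
type ErrorKind int

const (
	AppError ErrorKind = iota // e.g. a compile error in user code
	InfraError                // e.g. Docker missing, wrong kubecontext
)

// classifiedError wraps an underlying error together with its kind.
type classifiedError struct {
	kind ErrorKind
	err  error
}

func (e *classifiedError) Error() string { return e.err.Error() }
func (e *classifiedError) Unwrap() error { return e.err }

// Infra marks err as an infrastructure error.
func Infra(err error) error { return &classifiedError{kind: InfraError, err: err} }

// KindOf returns the kind of the first classified error in the chain.
// Unclassified errors are treated as application errors.
func KindOf(err error) ErrorKind {
	var ce *classifiedError
	if errors.As(err, &ce) {
		return ce.kind
	}
	return AppError
}

// ShouldAbortFirstIteration reports whether the dev loop should exit
// instead of waiting for the user to fix the problem and save a file.
func ShouldAbortFirstIteration(err error) bool {
	return err != nil && KindOf(err) == InfraError
}

// Sample errors, assumed for the demo only.
var (
	errDockerMissing = fmt.Errorf("build: %w", Infra(errors.New("docker daemon not reachable")))
	errCompile       = errors.New("build: compile error in main.go")
)

func main() {
	fmt.Println(ShouldAbortFirstIteration(errDockerMissing)) // true
	fmt.Println(ShouldAbortFirstIteration(errCompile))       // false
}
```

Defaulting unclassified errors to the application kind keeps the dev loop alive unless something has been explicitly marked as infrastructure-related; the opposite default would be just as easy to sketch.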
I realized that this is a bit more subtle. There is one issue that we'll have to resolve if we go down this route: currently skaffold assumes that everything was built and deployed at least once as a baseline, and every file change then triggers only an incremental change on top of that.
Now, if we want to change this behavior, that means that we'll need to keep track of which artifacts haven't been built yet. Skaffold shouldn't deploy before all artifacts are built.
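A rough sketch of the bookkeeping this implies, assuming artifacts are identified by plain string names. The buildTracker type and its methods are hypothetical and only illustrate the idea; they are not taken from skaffold's code.

```go
package main

import (
	"fmt"
	"sort"
)

// buildTracker records which artifacts still lack a successful build.
// Hypothetical helper for illustration only.
type buildTracker struct {
	pending map[string]bool
}

// newBuildTracker starts with every artifact pending, since nothing
// has been built yet when the dev loop starts.
func newBuildTracker(artifacts []string) *buildTracker {
	p := make(map[string]bool, len(artifacts))
	for _, a := range artifacts {
		p[a] = true
	}
	return &buildTracker{pending: p}
}

// ToBuild returns the artifacts to build in this iteration: the ones
// whose files changed plus the ones that have never built successfully.
func (t *buildTracker) ToBuild(changed []string) []string {
	set := make(map[string]bool)
	for a := range t.pending {
		set[a] = true
	}
	for _, a := range changed {
		set[a] = true
	}
	out := make([]string, 0, len(set))
	for a := range set {
		out = append(out, a)
	}
	sort.Strings(out) // deterministic order for readability
	return out
}

// MarkBuilt records a successful build of artifact a.
func (t *buildTracker) MarkBuilt(a string) { delete(t.pending, a) }

// MarkFailed records a failed build of artifact a.
func (t *buildTracker) MarkFailed(a string) { t.pending[a] = true }

// ReadyToDeploy is true only once every artifact has a successful build.
func (t *buildTracker) ReadyToDeploy() bool { return len(t.pending) == 0 }

func main() {
	t := newBuildTracker([]string{"api", "web"})
	t.MarkBuilt("api")  // first iteration: api builds...
	t.MarkFailed("web") // ...but web fails

	fmt.Println(t.ReadyToDeploy()) // false
	fmt.Println(t.ToBuild(nil))    // [web]

	t.MarkBuilt("web") // user fixes web, next iteration succeeds
	fmt.Println(t.ReadyToDeploy()) // true
}
```

With something like this, a file change in an already-built artifact would still rebuild the never-built ones, and deploy would be skipped until ReadyToDeploy returns true.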
@balopat there might be ImagePullBackOff errors too.
I have to run skaffold dev --no-prune=true --cleanup=false when developing locally.
In the prototype stage, I intend to keep images local and not push them to any registry, so there is no need to prune or clean up.
Maybe just set the default values of --no-prune and --cleanup to true and false, respectively,
or save these settings to ~/.skaffold/config like this:
global:
  survey:
    last-prompted: "2020-10-19T20:10:03+08:00"
kubeContexts: []
dev:
  no-prune: true
  cleanup: false
From #4953, we should provide a flag to configure this behavior.
+1 to configurable flag
We are currently running into this issue as well.
Many of our pods depend on Postgres DB being ready, and were designed to fail so they can be restarted.
The current skaffold dev behavior of terminating all pods on failure makes it unusable for us.
We're also in strong need of this feature. It would make test-driven development much easier.
A recently added feature, enabled via --tolerate-failures-until-deadline and the deploy.tolerateFailuresUntilDeadline=true config (in skaffold.yaml), allows dev (as well as run, apply, and deploy) to not fail when a deployment encounters an error, but instead to keep polling for success until statusCheckDeadlineSeconds or the k8s object controller's own timeout (e.g. progressDeadlineSeconds for deployments) is reached. Using this flag should at least help solve the first-failed-deploy part of this issue for users here.
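For anyone wondering what "keep polling until the deadline" means in practice, here is a small sketch of the general pattern. It is not skaffold's status-check code; pollUntilDeadline, runDemo, the 2-second timeout and the simulated "postgres not ready" check are all assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollUntilDeadline calls check repeatedly and tolerates failures until
// check succeeds or ctx is done, whichever comes first.
func pollUntilDeadline(ctx context.Context, interval time.Duration, check func() error) error {
	for {
		lastErr := check()
		if lastErr == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			// Deadline reached: only now is the failure reported.
			return fmt.Errorf("deadline reached, last error: %w", lastErr)
		case <-time.After(interval):
			// Wait, then try again.
		}
	}
}

// runDemo simulates a pod that fails twice (its database is not up yet)
// and becomes healthy on the third check. It returns the attempt count.
func runDemo() (int, error) {
	attempts := 0
	check := func() error {
		attempts++
		if attempts < 3 {
			return errors.New("postgres not ready")
		}
		return nil
	}

	// Stand-in for a status check deadline; the value is arbitrary.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	err := pollUntilDeadline(ctx, 10*time.Millisecond, check)
	return attempts, err
}

func main() {
	attempts, err := runDemo()
	fmt.Println(attempts, err) // 3 <nil>
}
```

The point of the pattern is that early failures, such as a pod crashing because its database is not ready yet, are not treated as final as long as the resource recovers before the deadline.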