Sync not starting due to health checks on devspace v6
What happened?
After upgrading from devspace v5 to v6, devspace dev is no longer working.
It seems like a chicken-and-egg situation: the containers no longer start after re-building the image with the new restart helper.
The container now waits for the initial sync of the files, but as long as the process inside the container has not started, devspace does not begin the sync, because the health checks report the container state as "Unhealthy".
Container started with restart helper.
Waiting for initial sync to complete or file /.devspace/start to exist before starting the application...
(Still waiting...)
(Still waiting...)
(Still waiting...)
But
dev:sync-6 Pod container-statefulset-0: Readiness probe failed: dial tcp 10.131.2.33:1337: connect: connection refused (Unhealthy)
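For context, the deadlock comes from a readiness probe like the following, which keeps reporting the pod as unhealthy while the restart helper holds the application back. This is only a sketch: the port 1337 is taken from the log line above, the probe type and timings are assumptions.
# Sketch of a probe that keeps failing while the restart helper waits for
# the initial sync; only the port comes from the log above, the rest is assumed.
readinessProbe:
  tcpSocket:
    port: 1337
  initialDelaySeconds: 5
  periodSeconds: 10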
What did you expect to happen instead?
The application should start immediately, before waiting for the initial sync, as it did on devspace v5, so that the container health checks can pass.
Local Environment:
- DevSpace Version: 6.1.1
- Operating System: mac
- ARCH of the OS: ARM64
Kubernetes Cluster:
- Cloud Provider: other
- Kubernetes Version:
  Client Version: v1.25.0
  Kustomize Version: v4.5.7
  Server Version: v1.23.5+012e945
Workaround:
Added the following line to the Dockerfiles:
RUN mkdir -p /.devspace/ && touch /.devspace/start
/kind bug
@BirknerAlex Thanks for reporting this. It seems the readiness / liveness probes are not being removed. Could you provide more details of your devspace.yaml configuration?
Typically, if you're using the restartHelper configuration, the readiness / liveness probes would be replaced automatically.
Maybe it's a bug in v6.1.1 when still using the config version v1beta11. The config wasn't changed until now and is the same version we used before on devspace v5. Maybe the health-check removal is not applied when using the old config format?
I struggled with upgrading the format to the new v2beta1 version since our config was too large to upgrade without breaking everything; it was on my roadmap for the next few weeks.
I have the same issue. As a minimal example, here are my devspace.yaml and Dockerfile:
version: v1beta11
images:
  server:
    createPullSecret: false
    image: dockerregistry.example.com/foo/server
    injectRestartHelper: true
    tags:
      - dev-${DEVSPACE_USERNAME}-${DEVSPACE_RANDOM}
deployments:
  - name: to-devacct-cluster
    kubectl:
      replaceImageTags: true
      kustomize: true
      manifests:
        - manifests/overlays/development/devspace
dev:
  ports:
    - imageSelector: image(server):tag(server)
      forward:
        - port: 8080
FROM golang:1.18-alpine
ENV CGO_ENABLED=0
RUN go install github.com/go-delve/delve/cmd/dlv@latest
COPY main.go /app/
WORKDIR /app/
ENV GO111MODULE=off
RUN go build -a -o /app/build/server main.go
CMD [ "dlv", "debug", "--accept-multiclient", "--continue", "--headless", "--listen=:2345", "--api-version=2", "--log", "--wd=/"]
This works fine with devspace v5.18.5, but fails with v6.1.1 (and v6.1.0).
I tried v6.0.0 as well, but that had a completely unrelated issue (Error during image registry authentication: invalid reference format).
By the way, removing the health checks on demand would cause other issues. For example, we currently have dev tooling scripts that wait for a pod restart and for the application to become available again; if devspace v6 removed those health checks on deployment, those scripts would break.
I would personally recommend having the same behaviour as devspace v5 -> maybe an optional setting to prevent the restart helper from waiting for the initial sync before starting the container.
Speaking of health checks... I overlay them in my manifests, could that be the issue?
kustomization.yaml:
(...)
patchesJson6902:
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: foo
    path: probepatch.yaml
probepatch.yaml:
- op: replace
  path: /spec/template/spec/containers/0/livenessProbe/periodSeconds
  value: 120
- op: replace
  path: /spec/template/spec/containers/0/readinessProbe/periodSeconds
  value: 60
@BirknerAlex @ComaVN I'm looking into this deeper and was wondering if either of you have tried the delay container start example with the readiness / liveness probes in place?
@BirknerAlex waitInitialSync: false may be the missing configuration that will help with this.
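For anyone already on the v2beta1 format, that option sits on the sync entry of a dev configuration. A minimal sketch, assuming a dev pod named app and an /app sync path (both made up for illustration; the image is the one from the example above):
# v2beta1 sketch; the dev pod name "app" and the sync path are assumptions,
# only waitInitialSync itself comes from the suggestion above.
dev:
  app:
    imageSelector: dockerregistry.example.com/foo/server
    sync:
      - path: ./:/app
        waitInitialSync: false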
@lizardruss I have not migrated to the new config format yet, but I will do that soonish; the property does not exist in v1beta11. It seems like the default behavior changed for the legacy config format in devspace v6.
Setting waitInitialSync to false or true makes no difference in a v1beta11 config.
The problem I have is that I have hundreds of projects with a devspace 5 / v1beta11 config, and I would love to be able to gradually migrate them to devspace 6 / v2beta1. So I basically need a v1beta11 config that works with both devspace 5 and 6, but that seems impossible :(
Did anybody have time to check this? It's the last piece we need to use devspace without breaking our current workflow.
- DevSpace Version: 6.2.5
- Config Version: v2beta1
- OS: macOS Ventura 13.1
- Arch: arm64
Please tell me if you need more info.
Edit: the mkdir -p /.devspace/ && touch /.devspace/start workaround does not work for me.
Also, our liveness and readiness probes are done through an executable (grpc_health_probe); I don't know if that's a problem or not.
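For clarity, "probes done through an executable" means an exec-based probe roughly like the following. This is only a sketch: the binary location, address and period are assumptions, not taken from the actual manifests.
# Sketch of an exec-based readiness probe using grpc_health_probe; like a
# TCP probe, it keeps failing while the restart helper holds the app back.
readinessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:50051"]
  periodSeconds: 10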
Also my projects and their workflows are affected by this behaviour, so... any news on this issue?
@zeno-mioso Could you run with the --debug flag and share the output? Providing the devspace.yaml, Dockerfile and the rendered Deployment (you can run devspace deploy --render) would help us reproduce the issue.
Can't help with that anymore because I migrated all my configs to the new format in the meantime, but I think it has something to do with the parser of the old format, where the following "new" setting is interpreted as false while it should be true to keep it compatible.
The v2 option that I think is the reason when using the v1 format:
deployments:
  <app>:
    sync:
      - startContainer: true
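As far as I can tell from the v2beta1 schema, that setting belongs on the dev sync entry rather than on the deployment; a minimal sketch (the dev pod name app and the sync path are assumptions, the image is the one from the examples above):
# v2beta1 sketch; "app" and the sync path are assumptions. With the restart
# helper injected, startContainer: true tells DevSpace to start the application
# only after the initial sync completes, instead of leaving it waiting.
dev:
  app:
    imageSelector: dockerregistry.example.com/foo/server
    sync:
      - path: ./:/app
        startContainer: true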
@a-candiotti-pvotal More information would be great. Would it be possible to set up an example repository that reproduces the issue?
@ComaVN Sorry for the slow reply. The devspace.yaml you shared doesn't have a sync config. If this is still an issue for you, could you share the output?
When I try with a slightly modified configuration, here's the last lines of the output with DevSpace 6:
deploy:to-devacct-cluster Deploying chart /Users/russellcentanni/.devspace/component-chart/component-chart-0.8.5.tgz (to-devacct-cluster) with helm...
deploy:to-devacct-cluster Deployed helm chart (Release revision: 2)
deploy:to-devacct-cluster Successfully deployed to-devacct-cluster with helm
dev:ports-0 Waiting for pod to become ready...
dev:ports-0 Selected pod to-devacct-cluster-78cff9c6f8-rv596
dev:ports-0 ports Port forwarding started on: 8080 -> 8080
This is expected without sync, logs, or terminal configuration. My modified devspace.yaml is below. I replaced the kustomize deployment with helm, and removed the ${DEVSPACE_USERNAME} usage.
version: v1beta11
images:
  server:
    createPullSecret: false
    image: dockerregistry.example.com/foo/server
    injectRestartHelper: true
    tags:
      - dev-${DEVSPACE_RANDOM}
deployments:
  - name: to-devacct-cluster
    helm:
      replaceImageTags: true
      values:
        containers:
          - image: dockerregistry.example.com/foo/server
dev:
  ports:
    - imageSelector: image(server):tag(server)
      forward:
        - port: 8080
I've run this with DevSpace 5.18.5, and it looks like the main difference is that DevSpace 5 would automatically log the container output, whereas DevSpace 6 does not:
[0:ports] Port-Forwarding: Waiting for containers to start...
[0:ports] Port forwarding started on 8080:8080 (multi-sync/to-devacct-cluster-6458599cdd-66sr8)
[info] Starting log streaming
[to-devacct-cluster] Start streaming logs for multi-sync/to-devacct-cluster-6458599cdd-66sr8/container-0
[to-devacct-cluster] API server listening at: [::]:2345
[to-devacct-cluster] 2023-03-21T15:49:37Z warning layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
[to-devacct-cluster] 2023-03-21T15:49:37Z info layer=debugger launching process with args: [/app/__debug_bin]
[to-devacct-cluster] 2023-03-21T15:49:37Z debug layer=debugger continuing
[to-devacct-cluster] Hello, World!
[to-devacct-cluster] 2023-03-21T15:49:37Z error layer=rpc writing response:write tcp 127.0.0.1:2345->127.0.0.1:60840: use of closed network connection
So, I've added a logs configuration to it, which I think finally reproduces the issue:
dev:
  logs:
    selectors:
      - imageSelector: image(server):tag(server)
Output:
dev:ports-0 logs Container started with restart helper.
dev:ports-0 logs Waiting for initial sync to complete or file /.devspace/start to exist before starting the application...
dev:ports-0 logs (Still waiting...)
dev:ports-0 logs (Still waiting...)
dev:ports-0 logs (Still waiting...)
dev:ports-0 logs (Still waiting...)
I will look into resolving this, though it may be different from the original issue.
I can confirm that #2608 fixes this :+1: