
Sync not starting due to health checks on DevSpace v6

BirknerAlex opened this issue 2 years ago

What happened?

After upgrading from DevSpace v5 to v6, devspace dev no longer works.

It seems like a "A chicken and egg situation". The containers no longer starting after re-building the image with the new restart-helper.

The container now waits for the initial sync of the files, but the problem is that as long as the process inside the container has not started, DevSpace won't begin the sync, because the container state is "Unhealthy" due to the health checks.

Container started with restart helper.
Waiting for initial sync to complete or file /.devspace/start to exist before starting the application...
(Still waiting...)
(Still waiting...)
(Still waiting...)

But

dev:sync-6 Pod container-statefulset-0: Readiness probe failed: dial tcp 10.131.2.33:1337: connect: connection refused (Unhealthy)
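For illustration, a probe of roughly this shape would produce that event while the restart helper holds the application back (the real probe comes from our manifests and may differ):

readinessProbe:
  tcpSocket:
    port: 1337
  periodSeconds: 10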

What did you expect to happen instead?

The application should start immediately, before waiting for the initial sync (as it did on DevSpace v5), so that the container health checks can pass.

Local Environment:

  • DevSpace Version: 6.1.1
  • Operating System: mac
  • ARCH of the OS: ARM64

Kubernetes Cluster:

  • Cloud Provider: other
  • Kubernetes Version:
    Client Version: v1.25.0
    Kustomize Version: v4.5.7
    Server Version: v1.23.5+012e945

Workaround:

Added the following line to the Dockerfiles:

RUN mkdir -p /.devspace/ && touch /.devspace/start
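For context, a rough sketch of what the injected restart helper does while it waits (simplified illustration only, not DevSpace's actual helper script):

# simplified illustration, not the real restart helper
while [ ! -f /.devspace/start ]; do
  echo "(Still waiting...)"
  sleep 2
done
# once the marker file exists, start the real application command
exec "$@"

Baking the marker file into the image therefore lets the application start right away, which also means the helper no longer waits for the initial sync on normal devspace dev runs.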

/kind bug

BirknerAlex avatar Sep 15 '22 10:09 BirknerAlex

@BirknerAlex Thanks for reporting this. It seems the readiness / liveness probes are not being removed. Could you provide more details of your devspace.yaml configuration?

Typically, if you're using the restartHelper configuration, the readiness / liveness probes would be replaced automatically.
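One way to inspect which probes actually end up on the workload is to check the deployed pod template (workload kind and name are placeholders):

kubectl get statefulset <name> -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'
kubectl get statefulset <name> -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}'

If the probes were replaced or removed, the output will differ from what your manifests define.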

lizardruss avatar Sep 15 '22 22:09 lizardruss

Maybe it's a bug in v6.1.1 when still using config version v1beta11. The config hasn't been changed and is the same one we used with DevSpace v5. Maybe the health-check removal is not applied when using the old config format?

I have been struggling to upgrade the format to the new v2beta1 version, since our config is too large to migrate without breaking everything; it was on my roadmap for the next few weeks.

BirknerAlex avatar Sep 16 '22 07:09 BirknerAlex

I have the same issue. As a minimal example, here are my devspace.yaml and Dockerfile:

devspace.yaml:

version: v1beta11
images:
  server:
    createPullSecret: false
    image: dockerregistry.example.com/foo/server
    injectRestartHelper: true
    tags:
      - dev-${DEVSPACE_USERNAME}-${DEVSPACE_RANDOM}
deployments:
  - name: to-devacct-cluster
    kubectl:
      replaceImageTags: true
      kustomize: true
      manifests:
        - manifests/overlays/development/devspace
dev:
  ports:
    - imageSelector: image(server):tag(server)
      forward:
        - port: 8080

Dockerfile:

FROM golang:1.18-alpine

ENV CGO_ENABLED=0
RUN go install github.com/go-delve/delve/cmd/dlv@latest

COPY main.go /app/
WORKDIR /app/
ENV GO111MODULE=off
RUN go build -a -o /app/build/server main.go

CMD [ "dlv", "debug", "--accept-multiclient", "--continue", "--headless", "--listen=:2345", "--api-version=2", "--log", "--wd=/"]

This works fine with devspace v5.18.5, but fails with v6.1.1 (and v6.1.0)

I tried v6.0.0 as well, but that had a completely unrelated issue (Error during image registry authentication: invalid reference format)

ComaVN avatar Sep 19 '22 12:09 ComaVN

By the way, removing the health checks on demand would cause other issues. For example, we currently have dev tooling scripts that wait for a pod to restart and for the application to become available again; if DevSpace v6 removed the health checks on deployment, those scripts would break.
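As a hypothetical illustration, a wait script along these lines depends on the probes being in place (names are placeholders):

kubectl rollout status statefulset/<name> --timeout=120s
kubectl wait --for=condition=ready pod -l app=<name> --timeout=120s

If the probes were stripped at deploy time, the pod would report Ready immediately and the script would no longer actually wait for the application.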

I would personally recommend keeping the same behaviour as DevSpace v5, or maybe adding an optional setting to prevent the restart helper from waiting for the initial sync before starting the container.

BirknerAlex avatar Sep 19 '22 12:09 BirknerAlex

Speaking of health checks... I overlay them in my manifests, could that be the issue?

kustomization.yaml:

(...)
patchesJson6902:
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: foo
    path: probepatch.yaml

probepatch.yaml:

- op: replace
  path: /spec/template/spec/containers/0/livenessProbe/periodSeconds
  value: 120
- op: replace
  path: /spec/template/spec/containers/0/readinessProbe/periodSeconds
  value: 60

ComaVN avatar Sep 19 '22 13:09 ComaVN

@BirknerAlex @ComaVN I'm looking into this deeper and was wondering if either of you have tried the delay container start example with the readiness / liveness probes in place?

lizardruss avatar Sep 21 '22 22:09 lizardruss

@BirknerAlex waitInitialSync: false may be the missing configuration that will help with this.
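A minimal sketch of where that option would go in a v2beta1 config, assuming it is accepted on the sync entry (the dev section name, image selector and path are placeholders):

dev:
  app:
    imageSelector: dockerregistry.example.com/foo/server
    sync:
      - path: ./:/app
        waitInitialSync: false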

lizardruss avatar Sep 22 '22 00:09 lizardruss

@lizardruss I have not migrated to the new config format yet, but I will do that soonish; the property does not exist in v1beta11. It seems the default behavior changed for the legacy config format in DevSpace v6.

BirknerAlex avatar Sep 23 '22 15:09 BirknerAlex

waitInitialSync false or true makes no difference in a v1beta11 config.

The problem I have is that I have hundreds of projects with a DevSpace 5 / v1beta11 config, and I would love to be able to gradually migrate them to DevSpace 6 / v2beta1. So I basically need a v1beta11 config that works with both DevSpace 5 and 6, but that seems impossible :(

ComaVN avatar Nov 01 '22 08:11 ComaVN

Did anybody have time to check this? It's the last piece we need to use devspace without breaking our current workflow.

DevSpace version: 6.2.5
Config version: v2beta1
OS: macOS Ventura 13.1
Arch: arm64

Please tell me if you need more info.

edit: the mkdir -p /.devspace/ && touch /.devspace/start workaround does not work for me

Also, our liveness and readiness probes are done through an executable (grpc_health_probe); I don't know if that's a problem or not.
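For reference, such exec-based probes look roughly like this (address/port are placeholders):

readinessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:9090"]
  periodSeconds: 10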

a-candiotti-pvotal avatar Feb 02 '23 17:02 a-candiotti-pvotal

My projects and their workflows are also affected by this behaviour, so... any news on this issue?

zeno-mioso avatar Mar 08 '23 18:03 zeno-mioso

@zeno-mioso Could you run with the --debug flag and share the output? Providing the devspace.yaml, Dockerfile and the rendered Deployment (can run devspace deploy --render) would help us reproduce the issue.

lizardruss avatar Mar 15 '23 16:03 lizardruss

I can't help with that anymore because I have since migrated all my configs to the new format, but I think it has something to do with the parser for the old format: the following "new" setting is interpreted as false, whereas it should be true to stay compatible.

The v2 option that I think is the cause when using the v1 format:

deployments:
  <app>:
     sync:
       - startContainer: true

BirknerAlex avatar Mar 16 '23 09:03 BirknerAlex

@a-candiotti-pvotal More information would be great. Would it be possible to set up an example repository that reproduces the issue?

lizardruss avatar Mar 21 '23 15:03 lizardruss

@ComaVN Sorry for the slow reply. The devspace.yaml you shared doesn't have a sync config. If this is still an issue for you, could you share the output?

When I try with a slightly modified configuration, here are the last lines of the output with DevSpace 6:

deploy:to-devacct-cluster Deploying chart /Users/russellcentanni/.devspace/component-chart/component-chart-0.8.5.tgz (to-devacct-cluster) with helm...
deploy:to-devacct-cluster Deployed helm chart (Release revision: 2)
deploy:to-devacct-cluster Successfully deployed to-devacct-cluster with helm
dev:ports-0 Waiting for pod to become ready...
dev:ports-0 Selected pod to-devacct-cluster-78cff9c6f8-rv596
dev:ports-0 ports Port forwarding started on: 8080 -> 8080

This is expected without sync, logs, or terminal configuration. My modified devspace.yaml is below. I replaced the kustomize deployment with helm, and removed the ${DEVSPACE_USERNAME} usage.

version: v1beta11
images:
  server:
    createPullSecret: false
    image: dockerregistry.example.com/foo/server
    injectRestartHelper: true
    tags:
      - dev-${DEVSPACE_RANDOM}
deployments:
  - name: to-devacct-cluster
    helm:
      replaceImageTags: true
      values:
        containers:
          - image: dockerregistry.example.com/foo/server
dev:
  ports:
    - imageSelector: image(server):tag(server)
      forward:
        - port: 8080

I've run this with DevSpace 5.18.5, and it looks like the main difference is that DevSpace 5 would automatically log the container output, whereas DevSpace 6 does not:

[0:ports] Port-Forwarding: Waiting for containers to start...
[0:ports] Port forwarding started on 8080:8080 (multi-sync/to-devacct-cluster-6458599cdd-66sr8)
[info]   Starting log streaming
[to-devacct-cluster] Start streaming logs for multi-sync/to-devacct-cluster-6458599cdd-66sr8/container-0
[to-devacct-cluster] API server listening at: [::]:2345
[to-devacct-cluster] 2023-03-21T15:49:37Z warning layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
[to-devacct-cluster] 2023-03-21T15:49:37Z info layer=debugger launching process with args: [/app/__debug_bin]
[to-devacct-cluster] 2023-03-21T15:49:37Z debug layer=debugger continuing
[to-devacct-cluster] Hello, World!
[to-devacct-cluster] 2023-03-21T15:49:37Z error layer=rpc writing response:write tcp 127.0.0.1:2345->127.0.0.1:60840: use of closed network connection

So, I've added a logs configuration to it, which I think finally reproduces the issue:

dev:
  logs:
    selectors:
    - imageSelector: image(server):tag(server)

Output:

dev:ports-0 logs  Container started with restart helper.
dev:ports-0 logs  Waiting for initial sync to complete or file /.devspace/start to exist before starting the application...
dev:ports-0 logs  (Still waiting...)
dev:ports-0 logs  (Still waiting...)
dev:ports-0 logs  (Still waiting...)
dev:ports-0 logs  (Still waiting...)

I will look into resolving this, though it may be different from the original issue.

lizardruss avatar Mar 21 '23 16:03 lizardruss

I can confirm that #2608 fixes this :+1:

ComaVN avatar Apr 05 '23 09:04 ComaVN