Custom healthcheck options don't work at role-level

Open fschueller opened this issue 1 year ago • 2 comments

I tried setting a custom log_lines value at the role level, but it did not make it into the configuration. In fact, no custom value I set in a role-level healthcheck was accepted into the configuration that Kamal loads. The documentation reads as if this should work, though.

Configs to reproduce:

deploy.yml

service: test

image: test/test-image

registry:
  username: testbot
  password: password

deploy.staging.yml

servers:
  web:
    hosts:
      - staging-server-1
      - staging-server-2
    healthcheck:
      log_lines: 100
      max_attempts: 4

>> kamal config -d staging
---
:roles:
- web
:hosts:
- staging-server-1
- staging-server-2
:primary_host: staging-server-1
:version: [tag]
:repository: test/test-image
:absolute_image: test/test-image:[tag]
:service_with_version: test-[tag]
:volume_args: []
:ssh_options:
  :user: root
  :port: 22
  :keepalive: true
  :keepalive_interval: 30
  :log_level: :fatal
:sshkit: {}
:builder: {}
:logging:
- "--log-opt"
- max-size="10m"
:healthcheck:
  path: "/up"
  port: 3000
  max_attempts: 7
  exposed_port: 3999
  cord: "/tmp/kamal-cord"
  log_lines: 50

Setting the same options in the top-level healthcheck section works:

deploy.staging.yml

healthcheck:
  log_lines: 100
  max_attempts: 4

servers:
  web:
    hosts:
      - staging-server-1
      - staging-server-2

>> bin/kamal config -d staging
---
:roles:
- web
:hosts:
- staging-server-1
- staging-server-2
:primary_host: staging-server-1
:version: [tag]
:repository: test/test-image
:absolute_image: test/test-image:[tag]
:service_with_version: test-[tag]
:volume_args: []
:ssh_options:
  :user: root
  :port: 22
  :keepalive: true
  :keepalive_interval: 30
  :log_level: :fatal
:sshkit: {}
:builder: {}
:logging:
- "--log-opt"
- max-size="10m"
:healthcheck:
  path: "/up"
  port: 3000
  max_attempts: 4
  exposed_port: 3999
  cord: "/tmp/kamal-cord"
  log_lines: 100
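
The precedence the report expects (built-in defaults, then the top-level healthcheck, then the role-level healthcheck) can be sketched in Python. This is only an illustration of the expected merge order, not Kamal's actual implementation: `effective_healthcheck` is a hypothetical helper, and the default values are copied from the `kamal config` output above.

```python
# Built-in healthcheck defaults, copied from the `kamal config` output above.
DEFAULTS = {
    "path": "/up",
    "port": 3000,
    "max_attempts": 7,
    "exposed_port": 3999,
    "cord": "/tmp/kamal-cord",
    "log_lines": 50,
}


def effective_healthcheck(config, role):
    """Hypothetical helper: merge defaults, top-level, then role-level options."""
    top_level = config.get("healthcheck") or {}
    servers = config.get("servers")
    role_config = servers.get(role) if isinstance(servers, dict) else None
    # Only a mapping-style role entry can carry its own healthcheck options.
    role_level = {}
    if isinstance(role_config, dict):
        role_level = role_config.get("healthcheck") or {}
    # Later dictionaries win, so role-level values override everything else.
    return {**DEFAULTS, **top_level, **role_level}


# The role-level configuration from the first deploy.staging.yml above.
staging = {
    "servers": {
        "web": {
            "hosts": ["staging-server-1", "staging-server-2"],
            "healthcheck": {"log_lines": 100, "max_attempts": 4},
        }
    }
}

merged = effective_healthcheck(staging, "web")
print(merged["log_lines"], merged["max_attempts"])  # prints "100 4"
```

With this merge order, the role-level configuration would yield log_lines 100 and max_attempts 4, which is what the top-level configuration produces but the role-level one does not.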

fschueller avatar Nov 20 '23 14:11 fschueller

There's a related issue: the healthcheck is tied to the primary role rather than being role-specific. If I try to deploy to a non-primary role without also deploying to the primary role, e.g.

kamal deploy -d qa -r=cron

Kamal tries to run a healthcheck against the web role, which does not exist on the cron host. It looks for .kamal/env/roles/service-web-qa.env, which is not present on the host with the cron role.
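
The mismatch can be sketched in Python under two loudly stated assumptions: the naming pattern `<service>-<role>-<destination>.env` is inferred from the single path quoted above, and a host deployed with `-r=cron` is assumed to hold only the env file for its own role. `role_env_file` is a hypothetical helper, not a Kamal function, and "service" stands in for the service name as it appears in that path.

```python
def role_env_file(service, role, destination):
    """Hypothetical helper; the pattern is inferred from the one path quoted above."""
    return f".kamal/env/roles/{service}-{role}-{destination}.env"


# Assumption: the cron host only has the env file for the cron role.
files_on_cron_host = {role_env_file("service", "cron", "qa")}

# The healthcheck targets the primary (web) role even though only cron was requested.
wanted = role_env_file("service", "web", "qa")
print(wanted)
print(wanted in files_on_cron_host)
```

Under those assumptions the lookup fails, because the file the healthcheck wants belongs to a role that was never deployed to that host.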

medius avatar Jan 23 '24 22:01 medius

@fschueller @medius The healthcheck handling changed in Kamal 1.6.0 via https://github.com/basecamp/kamal/pull/740

Can you check whether this is still an issue for you?

morgoth avatar Jun 07 '24 06:06 morgoth

I'll close this now, since the healthcheck section has been removed in Kamal 2.

djmb avatar Sep 30 '24 07:09 djmb