Always pass health check options to kamal-proxy commands

Open nickhammond opened this issue 2 months ago • 1 comment

kamal-proxy applies the following health check defaults when Kamal runs the kamal-proxy deploy command. They are documented in the kamal-proxy repo and, for the most part, on the deploy command docs page, but what if we made them obvious by always passing them explicitly to the kamal-proxy deploy command?

--health-check-path /up
--health-check-port 80

Right now, Kamal runs the kamal-proxy deploy command like this:

  • The target shows the container port (80), but nothing mentions the health check or /up.
  INFO [6c77e456] Running docker exec kamal-proxy kamal-proxy deploy shipyrd-web-production \
--target="205f017b8455:80" --host="app.shipyrd.io" --tls --deploy-timeout="30s" --drain-timeout="30s" \
--buffer-requests --buffer-responses --log-request-header="Cache-Control"  \
--log-request-header="Last-Modified" --log-request-header="User-Agent" on 144.1.2.3

Including the health check details adds helpful context:

  • target is mapped to port 80, health check is /up, health check port is 80
  INFO [6c77e456] Running docker exec kamal-proxy kamal-proxy deploy shipyrd-web-production \
--target="205f017b8455:80" --health-check-path="/up" --health-check-port=80 \
--host="app.shipyrd.io" --tls --deploy-timeout="30s" --drain-timeout="30s" \
--buffer-requests --buffer-responses --log-request-header="Cache-Control"  \
--log-request-header="Last-Modified" --log-request-header="User-Agent" on 144.1.2.3
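As a rough illustration of the idea, here is a minimal Ruby sketch of a command builder that always emits the health check options, falling back to the defaults quoted above. The helper name `proxy_deploy_args` and the `HEALTH_CHECK_DEFAULTS` constant are hypothetical, and this is not Kamal's actual implementation. The flag names are taken as written in this issue.

```ruby
# Hypothetical sketch, not Kamal's real code.
# Defaults are the values quoted in this issue.
HEALTH_CHECK_DEFAULTS = { path: "/up", port: 80 }.freeze

# Builds a kamal-proxy deploy command string that always includes
# the health check path and port, whether or not they were configured.
def proxy_deploy_args(service, target:, health_check: {}, host: nil)
  hc = HEALTH_CHECK_DEFAULTS.merge(health_check)

  args = ["kamal-proxy", "deploy", service]
  args << "--target=\"#{target}\""
  args << "--health-check-path=\"#{hc[:path]}\""
  args << "--health-check-port=#{hc[:port]}"
  args << "--host=\"#{host}\"" if host
  args.join(" ")
end
```

Calling `proxy_deploy_args("shipyrd-web-production", target: "205f017b8455:80", host: "app.shipyrd.io")` would then produce a command that spells out `--health-check-path="/up"` and `--health-check-port=80` even though neither was set by the user.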

When the app container exposes port 3000 instead of 80, the failure message doesn't mention the port at all, even though it's just a port mapping issue that setting app_port: 3000 resolves.

(From a Discord thread, not my deploy: https://discord.com/channels/1084848369073131531/1426128512779288606)

ERROR Failed to boot web on 78.47.111.87
  INFO First web container is unhealthy on 78.47.111.87, not booting any other roles
  INFO [fdb22813] Running docker container ls --all --filter name=^spree-app-web-4f1d0004e518d1b89121d0990b33e8a858c1a941$ --quiet | xargs docker logs --timestamps 2>&1 on 78.47.111.87
  INFO [fdb22813] Finished in 0.260 seconds with exit status 0 (successful).
 ERROR 
  INFO [4e5ccc0a] Running docker container ls --all --filter name=^spree-app-web-4f1d0004e518d1b89121d0990b33e8a858c1a941$ --quiet | xargs docker inspect --format '{{json .State.Health}}' on 78.47.111.87
  INFO [4e5ccc0a] Finished in 0.263 seconds with exit status 0 (successful).
 ERROR null
  INFO [e56ce71c] Running docker container ls --all --filter name=^spree-app-web-4f1d0004e518d1b89121d0990b33e8a858c1a941$ --quiet | xargs docker stop on 78.47.111.87
  INFO [e56ce71c] Finished in 10.560 seconds with exit status 0 (successful).
  Finished all in 85.9 seconds
Releasing the deploy lock...
  Finished all in 88.4 seconds
  ERROR (SSHKit::Command::Failed): Exception while executing on host 78.47.111.87: docker exit status: 1
docker stdout: Nothing written
docker stderr: Error: target failed to become healthy within configured timeout (30s)

The target failed to become healthy because the app container exposes port 3000, but the error only mentions a timeout. It would be helpful to include the health check details in the failure message as well.

docker stderr: Error: target failed to become healthy within configured timeout (30s)

Something like:

docker stderr: Error: target failed to become healthy within configured timeout (30s) at /up on port 80
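For illustration only, the proposed wording could be assembled along these lines. This is a hedged Ruby sketch: the function name `unhealthy_target_message` is hypothetical, and since the message above arrives on stderr from the kamal-proxy command, where the text would actually be built is left open here.

```ruby
# Hypothetical helper: formats the failure message with the
# health check path and port appended for context.
def unhealthy_target_message(timeout:, path:, port:)
  "target failed to become healthy within configured timeout (#{timeout}) " \
    "at #{path} on port #{port}"
end
```

With `timeout: "30s"`, `path: "/up"`, and `port: 80`, this yields the message proposed above.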

I can open a PR for this but just wanted to write out some ideas before moving forward.

nickhammond, Oct 11 '25 08:10

Yes sounds great @nickhammond, let's go for it.

djmb, Oct 19 '25 13:10