Enhance `healthcheck` blocking `dependent` behavior like lifecycle

Open loynoir opened this issue 2 years ago • 4 comments

Description

Problem

I found that the postgres image can't block a dependent's startup, so the dependent starts up too early.

Actual

If the dependent accepts clients too early, it fails.

Expected

The dependent should not start up and accept clients until postgres is ready.

Try

  • Add /docker-entrypoint-initdb.d/wait.sh, which does not block the dependent.

  • Override command, which does not block the dependent either.

  • Besides, when overriding entrypoint/command, you sometimes need to know the original entrypoint/command and call those hard-coded commands in a very ugly way.

  • https://stackoverflow.com/questions/31746182/docker-compose-wait-for-container-x-before-starting-y

I found that .depends_on.X.condition works, but it is not a very graceful way to solve wait-for-container-x-before-starting-y.

  • If .interval is large, .healthcheck.test runs late and the dependent starts late, which is not ideal.

  • If .interval is small, .healthcheck.test runs too many times, which is not very graceful.

Workaround

Below is my workaround.

  foo:
    ...
    depends_on:
      bar:
        condition: service_healthy
    ...
 
  bar:
    ...
    tmpfs:
      - /tmp
    healthcheck:
      test: 'if [ -e /tmp/first_run ]; then sleep 24h; else touch /tmp/first_run; bash /script/wait.sh; fi'
      timeout: 25h
      interval: 1s

Feat

I still don't think the workaround above is ideal, so I opened this feature request.

Enhance healthcheck behavior to work like a lifecycle.

  foo:
    ...
    depends_on:
      bar:
        condition: service_lifecycle_bootstrap
    ...
 
  bar:
    ...
    lifecycle:
        bootstrap: 'bash /script/wait.sh'

Related

  • https://github.com/docker/compose/issues/1809

  • https://github.com/docker/compose/issues/1510

Additional

Is there lifecycle support within the container ecosystem?

Yes, VS Code devcontainers support lifecycle commands.

https://github.com/devcontainers/spec/blob/main/schemas/devContainerFeature.schema.json#L106-L206

  • initializeCommand
  • onCreateCommand
  • updateContentCommand
  • postCreateCommand
  • postStartCommand
  • postAttachCommand

loynoir avatar May 18 '23 15:05 loynoir

The devcontainer feature you're listing here is about command hooks to run during the container lifecycle, not about dependency management.

IIUC your needs, you have to wait for postgres to be running and accepting connections before your client code gets started, otherwise the connection fails. This is exactly what condition: service_healthy is designed for: client containers won't get started before their dependency is reported as "healthy" by the container runtime, which is easy to set up as the postgres image includes the utility tool pg_isready.
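
A minimal sketch of that setup, assuming the official postgres image and a hypothetical client service named app. The service names, the client image, the password, and the timing values are placeholders for illustration, not values taken from this thread.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder, replace with a real secret
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s    # assumed value; how often the probe runs
      timeout: 3s     # assumed value
      retries: 10     # assumed value

  app:
    image: example/app:latest      # hypothetical client image
    depends_on:
      db:
        # app is not started until db is reported "healthy"
        condition: service_healthy
```

With this in place, `docker compose up` should hold back app until the db healthcheck has passed at least once.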

The healthcheck interval and health status are unfortunately managed by the container runtime, not Docker Compose, so we can't tweak the interval to avoid unnecessary runs once the service is started. AFAICT you're looking for a readiness check, which isn't supported by Docker Engine (see https://github.com/moby/moby/issues/30860). Maybe we should introduce this feature directly in Docker Compose.

ndeloof avatar May 22 '23 07:05 ndeloof

I've worked around this like so

  postgres: &postgres
    image: postgres:16
    healthcheck:
      test:
        - "CMD"
        - "bash"
        - "-c"
        - "echo 'select 1' | psql --dbname='postgres' --username=postgres"

      # Runs the healthcheck every 12 hours.  We don't particularly want this, but we are forced to specify *some*
      # interval.
      interval: 12h
      timeout: 10s
      retries: 3

      # This is what we really care about -- it amends the "interval" above, by running the health check every second,
      # stopping after 10 seconds, or if the test succeeds, whichever comes first.  The docker docs are not clear about
      # this.
      start_period: 10s
      start_interval: 1s

offby1 avatar Jul 26 '24 18:07 offby1

@offby1 this might interest you https://github.com/docker/compose/pull/12166

jhrotko avatar Oct 14 '24 17:10 jhrotko

@offby1 this might interest you #12166

Thanks. I read what docs I could find but it's not obvious that this new feature solves the problem in this issue. I assume it does, otherwise you wouldn't have drawn my attention to it. But might the docs for this new feature give an example of it solving this problem? Or at least mention that it can solve this class of problem?

offby1 avatar Oct 14 '24 23:10 offby1

What you are looking for is start_interval, which lets you define a higher rate to wait for the service to become healthy just after it has been created, then adopt a lower pace once it's up, just to report health status.
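
A rough sketch of that idea, assuming the postgres image and pg_isready as the probe. The service name and all durations are illustrative assumptions, and as far as I know start_interval needs Docker Engine 25.0 or later.

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      start_period: 30s    # assumed window in which start_interval applies
      start_interval: 1s   # probe every second while the container is starting
      interval: 30s        # assumed slower pace once the service is up
      timeout: 3s          # assumed value
      retries: 3           # assumed value
```

Compared with a single small interval, this keeps dependents from waiting long at startup without running the probe every second for the lifetime of the container.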

ndeloof avatar Oct 23 '24 07:10 ndeloof

Yes, my workaround above uses that.

offby1 avatar Oct 23 '24 14:10 offby1