kafka-docker

Docker compose health check

Open alfreddatakillen opened this issue 8 years ago • 14 comments

docker-compose now supports health checks (since version 1.10.0), and delaying start-up of containers until their dependencies are up and healthy. See https://docs.docker.com/compose/startup-order/ for docs on this.

It would be great to have an example of how to configure such health checks on a kafka container!

alfreddatakillen avatar Feb 09 '17 22:02 alfreddatakillen

Did you find a solution for this?

Doing something like this could work in the container:

healthcheck:
   test: ["CMD", "bash", "-c", "unset JMX_PORT; kafka-topics.sh --zookeeper zookeeper:2181 --list"]

I don't know if there is a better way. The unset is needed when you've enabled JMX.

ddewaele avatar Feb 25 '17 20:02 ddewaele

We're facing this issue with Kubernetes. Our solution was to include a health-check.sh in our Dockerfile with the following, which goes to zookeeper and checks to see if the active brokers list returned contains the broker id of the node. It's not perfect - I'd rather directly query the node if it's up - but I don't see a way to do this and (at least in our case) Kubernetes is watching the Zookeeper cluster as well in a StatefulSet so it seems relatively safe to me.

If there's general interest and @wurstmeister is into it I'd be willing to make a PR for this.

#! /bin/bash

r=`$KAFKA_HOME/bin/zookeeper-shell.sh zk-headless:2181 <<< "ls /brokers/ids" | tail -1 | jq '.[]'`
ids=( $r )
function contains() {
     local n=$#
     local value=${!n}
     for ((i=1;i < $#;i++)) {
         if [ "${!i}" == "${value}" ]; then
             echo "y"
             return 0
         fi
     }
     echo "n"
     return 1
}

x=`cat $KAFKA_HOME/config/server.properties | awk 'BEGIN{FS="="}/^broker.id=/{print $2}'`
if [ "$(contains "${ids[@]}" "$x")" == "y" ]; then echo "ok"; exit 0; else echo "doh"; exit 1; fi
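
The membership test at the heart of this script can also be written without `jq`, which may not be installed in every image. A minimal sketch, assuming the last line printed by zookeeper-shell.sh looks like `[1001, 1002]`; the function name `broker_registered` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if broker id $2 appears in a ZooKeeper
# listing such as "[1001, 1002]" passed as $1.
broker_registered() {
    local listing wanted id
    wanted=$2
    # Turn brackets and commas into spaces: "[1001, 1002]" -> " 1001  1002 "
    listing=$(printf '%s' "$1" | tr '[],' '   ')
    for id in $listing; do
        if [ "$id" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example use inside a health check (not run here; assumes $x holds the broker id):
# broker_registered "$($KAFKA_HOME/bin/zookeeper-shell.sh zk-headless:2181 <<< "ls /brokers/ids" | tail -1)" "$x"
```

The comparison is done word by word, so an id such as 100 does not match 1001.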

knordstrom avatar Jun 20 '17 17:06 knordstrom

@alfreddatakillen - The document you reference just explains how to wrap your startup process in an external script / shell call (https://docs.docker.com/v1.10/compose/startup-order/)

I assume you meant to reference https://docs.docker.com/compose/compose-file/#healthcheck - this was added in 1.12? However, this only reports the status of the container and does not block downstream dependencies. I believe this was mainly added for swarm support / restart policies

e.g.

version: '2.1'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
    healthcheck:
      test: ["CMD-SHELL", "echo ruok | nc -w 2 zookeeper 4444"]
      interval: 5s
      timeout: 10s
      retries: 3
  kafka:
    build: .
    depends_on:
      - zookeeper
    ports:
      - "9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

This is only useful for reporting the status of the container. Kafka will be started immediately after Zookeeper, regardless of whether Zookeeper is reported as healthy.

$ docker ps
CONTAINER ID        IMAGE                      STATUS
6c6dee3e5aae        kafkadocker_kafka          Up About a minute
c2d8a40b785e        wurstmeister/zookeeper     Up About a minute (unhealthy)
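
The health state shown by `docker ps` can also be read from a script, which is one way to wait for it by hand. A minimal sketch, assuming the `docker` CLI is on the PATH; the function names `container_health` and `wait_healthy` are hypothetical:

```shell
#!/usr/bin/env bash
# Print the health status of a container: starting, healthy or unhealthy.
container_health() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}

# wait_healthy CONTAINER ATTEMPTS DELAY_SECONDS
# Polls until the container reports "healthy" or the attempts run out.
wait_healthy() {
    local name=$1 attempts=$2 delay=$3 i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(container_health "$name" 2>/dev/null)" = "healthy" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here): wait up to a minute for the zookeeper container above.
# wait_healthy c2d8a40b785e 12 5 && echo "zookeeper is healthy"
```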

@knordstrom - the healthcheck case may work OK for k8s if it was added to the docker image.

sscaling avatar Mar 29 '18 09:03 sscaling

For future visitors who cannot get @knordstrom's healthcheck to work, this is how I solved it:

Make sure that your Dockerfile adds the shell script to a directory in the image.

We're using dynamic IDs generated by Zookeeper, so I had to use the following healthcheck.sh:

#! /bin/bash

unset JMX_PORT # https://github.com/wurstmeister/kafka-docker/issues/171#issuecomment-327097497

r=`$KAFKA_HOME/bin/zookeeper-shell.sh zookeeper:2181 <<< "ls /brokers/ids" | tail -1 | jq '.[]'`
ids=( $r )
function contains() {
     local n=$#
     local value=${!n}
     for ((i=1;i < $#;i++)) {
         if [ "${!i}" == "${value}" ]; then
             echo "y"
             return 0
         fi
     }
     echo "n"
     return 1
}

LOG_DIR=$(awk -F= -v x="log.dirs" '$1==x{print $2}' /opt/kafka/config/server.properties)
x=`cat ${LOG_DIR}/meta.properties | awk 'BEGIN{FS="="}/^broker.id=/{print $2}'`
if [ $(contains "${ids[@]}" "$x") == "y" ]; then echo "ok"; exit 0; else echo "doh"; exit 1; fi

Finally, add the following to your docker-compose file:

    healthcheck:
      test: ["CMD-SHELL", "/bin/healthcheck.sh"]
      interval: 5s
      timeout: 10s
      retries: 5

Lukkie avatar Nov 19 '18 10:11 Lukkie

There's a very nice solution here: https://github.com/confluentinc/cp-docker-images/issues/358#issuecomment-356059519

dobesv avatar Jan 09 '20 05:01 dobesv

There's a very nice solution here: confluentinc/cp-docker-images#358 (comment)

That is only for Zookeeper; it doesn't help with checking when Kafka is up and ready.

matthew-d-jones avatar Jan 09 '20 07:01 matthew-d-jones

Ah you are right, sorry. I must have mixed this issue up with another one.

dobesv avatar Jan 09 '20 07:01 dobesv

@Lukkie @dobesv @matthew-d-jones @sscaling @knordstrom @ddewaele @alfreddatakillen

Hi,

Your solution seems good, but unfortunately I am unable to get it working.

My kafka-setup service does not wait for the status to be OK before launching its commands!

  kafka-server:
    image: 'wurstmeister/kafka:2.12_2.5.0'
    container_name: kafka-server
    hostname: kafka-server
    ports:
      - '9092:9092'
      - '29092:29092'
      - '1099:1099'
    volumes:
      - '/c/Users/PHP/Desktop/imm/ping/healthcheck.sh:/bin/healthcheck.sh'
    environment:
    .....
    depends_on:
      - zookeeper-server
    healthcheck:
      test: ["CMD-SHELL", "/bin/healthcheck.sh"]
      interval: 5s
      timeout: 10s
      retries: 5

  kafka-setup:
    image: 'wurstmeister/kafka:2.12_2.5.0'
    hostname: kafka-setup
    container_name: kafka-setup
    command: "bash -c 'echo Waiting for Kafka to be ready... && \
                       ./opt/kafka/bin/kafka-topics.sh --create --if-not-exists --zookeeper zookeeper-server:2181 --partitions 1 --replication-factor 1 --topic test1 && \
                       ./opt/kafka/bin/kafka-topics.sh --create --if-not-exists --zookeeper zookeeper-server:2181 --partitions 1 --replication-factor 1 --topic test2'"
    environment:
      KAFKA_BROKER_ID: ignored
      KAFKA_ZOOKEEPER_CONNECT: ignored
    depends_on:
      - kafka-server

When I check the health logs:

First time:

{"Status":"starting","FailingStreak":1,"Log":[{"Start":"2020-06-02T14:29:38.838704407Z","End":"2020-06-02T14:29:43.530036206Z","ExitCode":1,"Output":"cat: can't open '/kafka/kafka-logs-kafka-server/meta.properties': No such file or directory\ndoh\n"}]}

And then:

{"Status":"healthy","FailingStreak":0,"Log":[
{"Start":"2020-06-02T14:35:24.372947922Z","End":"2020-06-02T14:35:26.26390467Z","ExitCode":0,"Output":"ok\n"},
{"Start":"2020-06-02T14:35:31.271064187Z","End":"2020-06-02T14:35:33.22431355Z","ExitCode":0,"Output":"ok\n"},
{"Start":"2020-06-02T14:35:38.230154493Z","End":"2020-06-02T14:35:40.243810443Z","ExitCode":0,"Output":"ok\n"},
{"Start":"2020-06-02T14:35:45.252980835Z","End":"2020-06-02T14:35:47.20696075Z","ExitCode":0,"Output":"ok\n"},
{"Start":"2020-06-02T14:35:52.211854112Z","End":"2020-06-02T14:35:54.159074005Z","ExitCode":0,"Output":"ok\n"}
]}

And the errors returned from my kafka-setup service:

Waiting for Kafka to be ready...
Error while executing topic command : Replication factor: 1 larger than available brokers: 0.
[2020-06-02 14:29:40,482] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 1 larger than available brokers: 0.
 (kafka.admin.TopicCommand$)

This means the commands were launched before the healthcheck reported an OK status.

How were you able to get it working, please?

Thank you all!
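
One possible explanation, offered tentatively: the short-form `depends_on` list only orders container start-up and does not wait for a `healthy` status. A workaround that stays inside the kafka-setup container is to retry the first command until it succeeds. A minimal sketch; the `retry` helper and its argument order are hypothetical, and it assumes kafka-topics.sh exits non-zero while no broker is registered:

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0, at most ATTEMPTS times,
# sleeping DELAY_SECONDS after each failure.
retry() {
    local attempts=$1 delay=$2 i=0
    shift 2
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here): keep trying to create the first topic for about a minute.
# retry 30 2 /opt/kafka/bin/kafka-topics.sh --create --if-not-exists \
#     --zookeeper zookeeper-server:2181 --partitions 1 --replication-factor 1 --topic test1
```

Because the topic commands use `--if-not-exists`, repeating them should be harmless once the broker is up.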

KabDeveloper avatar Jun 02 '20 14:06 KabDeveloper

This seems to work ok for me for checking whether Kafka is up or not:

nc -z localhost 9091 || exit 1

(Note that I personally have Kafka running on port 9091, not 9092.)

It's basic but it seems to be good enough to keep my other containers from starting up before Kafka is actually ready to start receiving traffic.

TaylorSMarks avatar Jan 19 '23 16:01 TaylorSMarks

This healthcheck works perfectly for me:

    image: bitnami/kafka:3.4.0
    ...
    healthcheck:
      test: kafka-cluster.sh cluster-id --bootstrap-server localhost:9092 || exit 1
      interval: 1s
      timeout: 60s
      retries: 60

pprishchepa avatar Sep 04 '23 18:09 pprishchepa

    healthcheck:
      test: ["CMD-SHELL", "pgrep -f 'kafka.*9101' || exit 1"]
      interval: 2m
      timeout: 10s
      retries: 3

jagatsingh avatar Dec 19 '23 06:12 jagatsingh

Another, better option is to use:

test: ["CMD-SHELL", "(echo > /dev/tcp/kafka1/9092) &>/dev/null && exit 0 || exit 1"]
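
The `/dev/tcp` redirection is a bash feature, so this check needs no extra tools such as `nc`, but it only proves that the port accepts connections. A minimal sketch wrapping it in a function with a timeout; the name `port_open` is hypothetical, and it assumes `bash` and the coreutils `timeout` command exist in the image:

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection to HOST:PORT can be opened within the timeout.
port_open() {
    local host=$1 port=$2 secs=${3:-2}
    # Run the redirection in an explicit bash so /dev/tcp is available
    # even when the calling shell is not bash.
    timeout "$secs" bash -c ': > "/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example (not run here): port_open kafka1 9092 || exit 1
```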

anvaari avatar Feb 12 '24 15:02 anvaari

Here is the solution for quay.io/debezium/kafka:2.5. Debezium won't install nc.

Tracking https://debezium.zulipchat.com/#narrow/stream/302529-community-general/topic/Kafka.20health.20check/near/433819156

  kafka:
    image: quay.io/debezium/kafka:2.5
    ports:
      - 9092:9092
    depends_on:
      zookeeper:
        condition: service_healthy
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
    healthcheck:
      test: /kafka/bin/kafka-cluster.sh cluster-id --bootstrap-server kafka:9092 || exit 1
      interval: 1s
      timeout: 60s
      retries: 60

alberttwong avatar Apr 17 '24 03:04 alberttwong