Docker Healthcheck support on Portainer Container

Open JaneX8 opened this issue 4 years ago • 22 comments

Describe the feature: Being able to see a "health status" of the Portainer Docker container.

Describe the solution you'd like: I would like support for the Docker Healthcheck (which is also shown in Portainer.io's own dashboard and probably in other Docker management software).

Describe alternatives you've considered: The alternative is to set up something similar without using the tools that already exist within Docker.

Additional context: The Dockerfile could contain something like this:

HEALTHCHECK --interval=60s --timeout=10s --retries=3 CMD curl -sS http://localhost:9000 || exit 1

For debugging and testing purposes you can use:

docker inspect --format "{{json .State.Health}}" containername
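For scripting, the status string alone is often enough. Below is a minimal sketch, assuming a POSIX shell. The function name `health_to_exit` and the exit-code mapping are illustrative choices, not anything Docker defines; only the status values `starting`, `healthy` and `unhealthy` come from Docker itself.

```shell
#!/bin/sh
# Map a Docker health status string to a script-friendly exit code.
# The exit-code mapping below is an assumption made for this sketch.
health_to_exit() {
    case "$1" in
        healthy)   return 0 ;;
        unhealthy) return 1 ;;
        starting)  return 2 ;;
        *)         return 3 ;;  # empty or unknown value, e.g. no healthcheck configured
    esac
}

# Example use (requires a running container named "containername"):
#   status=$(docker inspect --format '{{.State.Health.Status}}' containername)
#   health_to_exit "$status"; echo $?
```

The function itself never calls Docker, so it can be tested without a running daemon.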

JaneX8 avatar Feb 23 '20 20:02 JaneX8

This is indeed a very useful suggestion. I have also been thinking about how to do this for some time. Here are a couple of comments from my own experience.

First, I wouldn't advise using curl as suggested in this ticket, because then we would need to ship the curl binary (and its dependencies) inside the container as well. I would also advise against forcing the healthcheck in the Dockerfile using the HEALTHCHECK directive.

Instead, I propose to implement a simple healthcheck routine in the Portainer binary itself that Docker can then use during healthchecks. In this case, Portainer can dial itself, request a status update, print the result, and exit with code 0 if the HTTP status is 2XX or with a non-zero code otherwise.

Luckily, Portainer already implements a status API endpoint that can be leveraged for this proposal. Therefore we just need to implement a simple flag, e.g. --healthcheck, for the Portainer binary that calls its own Status API, returns the result and exits with an appropriate exit code.

For example:

# healthy case
$ portainer --healthcheck; echo $?
{"Authentication":true,"EndpointManagement":true,"Snapshot":true,"Analytics":false,"Version":"1.23.1"}
0

# unhealthy case
$ portainer --healthcheck; echo $?
{"err": "Something bad happened"}
1
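The exit-code rule described above (2XX means healthy, anything else means unhealthy) can be sketched in shell. This only illustrates the proposed semantics; it is not Portainer code, and the function name `http_code_to_exit` is hypothetical.

```shell
#!/bin/sh
# Return 0 for any 2XX HTTP status code and 1 for everything else,
# mirroring the exit levels proposed for the --healthcheck flag.
http_code_to_exit() {
    case "$1" in
        2[0-9][0-9]) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example use with curl (assumes curl is available and Portainer listens on port 9000):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:9000/api/status)
#   http_code_to_exit "$code"; echo $?
```

When the connection itself fails, curl prints 000 as the status code, which the function treats as unhealthy.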

With the above in place, healthchecks can be enabled in a Portainer stack with the following:

healthcheck:
  test: ['CMD', 'portainer', '--healthcheck']

For reference, this is how the Kong API Gateway does healthcheck, i.e. kong up command in a stack, and how PostgreSQL does it as well, i.e. pg_isready command also in a stack. This approach is more robust, requires no additional dependencies and can be smarter than just checking if the server responds via HTTP, i.e. return more elaborate status reports.

Moreover, this same approach can also be implemented for the Portainer Agent binary.

@itsconquest if you and the Portainer team agree on this idea, I can work on it relatively quickly, as it doesn't involve working with UI elements and I can easily test it on my side.

hhromic avatar Mar 03 '20 14:03 hhromic

@ElleshaHackett In curl-enabled containers, I mostly curl the page and grep for part of the "good" status page. Works like a charm and checks more than just HTTP 200. Your example just checks if "something" is served with HTTP 200 on port 9000. That's not enough to verify Portainer is actually processing requests.
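The "curl and grep" pattern can be sketched as follows, assuming a POSIX shell. The function name `body_has_marker` is made up for illustration, and the choice of marker string is up to the caller; `"Version"` is used in the example only because it appears in the status JSON quoted earlier in this thread.

```shell
#!/bin/sh
# Succeed only when the response body contains an expected marker string.
# Both arguments are supplied by the caller; nothing here is Portainer-specific.
body_has_marker() {
    body=$1
    marker=$2
    # -F treats the marker as a fixed string, -q suppresses output.
    printf '%s' "$body" | grep -qF -- "$marker"
}

# Example use (assumes curl exists in the container):
#   body=$(curl -fsS http://localhost:9000/api/status) || exit 1
#   body_has_marker "$body" '"Version"' || exit 1
```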

@hhromic This would indeed be a nice way to go. Without curl that's not an option, so this would be very nice to have indeed. Did you actually start working on it?

Ornias1993 avatar May 17 '20 09:05 Ornias1993

@Ornias1993 no I have not started working on this :) I was first waiting for some input from the Portainer team as in if they are interested, but then I forgot about this issue hehe.

@deviantony @itsconquest now that I've got more familiar with the Portainer codebase, perhaps I can code a prototype and submit as a PR for review?

hhromic avatar May 17 '20 11:05 hhromic

@hhromic Ahh, okay... Happens to the best of us :)

I read through most of the previous discussions about it. Afaik @deviantony and @itsconquest aren't against it, but no one actually takes it on or finishes it.

I think the fastest way of getting feedback is indeed throwing in a prototype and working from there. 👍

Ornias1993 avatar May 17 '20 11:05 Ornias1993

Alright then, I'll put a prototype together this week and see how it goes!

hhromic avatar May 17 '20 11:05 hhromic

Sounds like a good idea! I look forward to reviewing your work @hhromic :)

ghost avatar May 20 '20 22:05 ghost

Could be good also to have control over the healthcheck of the image or even disable the healthcheck according to https://docs.docker.com/engine/reference/run/#healthcheck

rhuanbarreto avatar Jul 07 '20 08:07 rhuanbarreto

@rhuanbarreto You can always override it in Docker. So that's a given.

Ornias1993 avatar Jul 07 '20 08:07 Ornias1993

Yes. But is it possible to do it in portainer?

rhuanbarreto avatar Jul 07 '20 09:07 rhuanbarreto

That's not in the scope of this issue; there is another issue for handling healthchecks inside Portainer, though.

Ornias1993 avatar Jul 07 '20 09:07 Ornias1993

Actually this was already implemented way before this issue... See https://github.com/portainer/portainer/pull/1366

And got reverted just because it isn't compatible with the --ssl flag (which makes it unsuitable to add to the dockerfile).

Ornias1993 avatar Nov 01 '20 13:11 Ornias1993

Hey guys,

Just stumbled across this, was there any movement on the --healthcheck? I understand there were a few issues with the previous solution

Thanks!

modem7 avatar Jan 10 '21 00:01 modem7

Maintainers are not interested it seems. And don't even care enough to just say so.

Ornias1993 avatar Jan 10 '21 12:01 Ornias1993

Would really like this feature also, it's a little odd that a platform designed for managing and monitoring your docker containers doesn't include the option to monitor itself. 🤷‍♂️

kwilliams1987 avatar Jan 15 '21 17:01 kwilliams1987

@hhromic were there any updates on your end?

modem7 avatar Jan 16 '21 00:01 modem7

@modem7, all: Apologies, I've been really busy with work in the last few months, so I haven't had the time I wish I had to work on this. If someone wants to step up, please do so; otherwise I will try to get back to this as soon as I can.

hhromic avatar Jan 16 '21 11:01 hhromic

Sorry for the silence on this one. We're interested in the feature; it's just that we have a lot of stuff to deal with as well.

We've been giving it more thought and we're thinking about bringing support for this feature along with https://github.com/portainer/portainer/issues/821, which should work around the potential issue we have had so far with HTTP/HTTPS and the healthcheck.

We have https://github.com/portainer/portainer/issues/821 in our backlog at the moment and we'll start thinking about this one based on the existing implementations that have been provided by contributors.

deviantony avatar Jan 19 '21 20:01 deviantony

Any news on this one? Also, for docker-compose also see https://docs.docker.com/compose/compose-file/compose-file-v3/#healthcheck. Although I am trying something with that now but I'm stuck with https://github.com/portainer/portainer/issues/1454.

JaneX8 avatar Oct 20 '21 09:10 JaneX8

You could use the portainer/portainer-ce:2.9.1-alpine image instead of the normal image, which is based on scratch (see https://github.com/portainer/portainer/issues/1364#issuecomment-922566640).

huib-portainer avatar Oct 20 '21 23:10 huib-portainer

My current solution, albeit not the best, is:

https://hub.docker.com/r/modem7/portainer-business and https://hub.docker.com/r/modem7/portainer

FROM portainer/portainer-ce:2.11.1-alpine

RUN apk --update --no-cache add curl && rm -rf /var/cache/apk/*

HEALTHCHECK --interval=10s --timeout=5s --start-period=20s --retries=3 CMD curl --fail http://127.0.0.1:9000/api/status || exit 1

modem7 avatar Apr 04 '22 12:04 modem7

First, I wouldn't advise using curl as suggested in this ticket, because then we would need to ship the curl binary (and its dependencies) inside the container as well. I would also advise against forcing the healthcheck in the Dockerfile using the HEALTHCHECK directive.

No need to include the curl binary and its dependencies. You can always define a simple run-time healthcheck as described in the Docker documentation. This container image includes wget, and simply hitting the main app's UI endpoint should be enough. (Also be sure to include the --no-check-certificate flag if you use a self-signed certificate for your Portainer container. Without this flag, wget fails on certificate verification before it ever checks the endpoint.)

Quick examples...

Docker CLI:

docker run -d -p 8000:8000/tcp -p 9443:9443/tcp --name portainer \
  --restart=always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v portainer_data:/data \
  --health-cmd="wget --no-verbose --tries=1 --spider --no-check-certificate https://localhost:9443 || exit 1" \
  --health-interval=30s \
  --health-timeout=5s \
  --health-retries=3 \
  --health-start-period=20s \
  portainer/portainer-ce:alpine

Docker Compose:

...
  portainer:
    ...
    ports:
      - 9443:9443
    healthcheck:
      test: "wget --no-verbose --tries=1 --spider --no-check-certificate https://localhost:9443 || exit 1"
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s

kaffolder7 avatar Jun 10 '22 05:06 kaffolder7

So, are there any plans to add portainer --healthcheck, possibly with a --no-tls extra flag?

MetalArend avatar Aug 01 '22 07:08 MetalArend

Docker Compose:

...
  portainer:
    ...
    ports:
      - 9443:9443
    healthcheck:
      test: "wget --no-verbose --tries=1 --spider --no-check-certificate https://localhost:9443 || exit 1"
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s

When using this approach I get the following error (Raspberry Pi 4):

OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "/bin/sh": stat /bin/sh: no such file or directory: unknown

Any suggestions?

kwilliams1987 avatar Oct 10 '22 08:10 kwilliams1987

Make sure to use the alpine tags (e.g. portainer/portainer-ee:alpine). Their normal tags are based on scratch which doesn't have a shell, much less things like wget.

modem7 avatar Oct 11 '22 09:10 modem7

portainer/portainer-ee:alpine

Also does not have /bin/bash

sgtcoder avatar Nov 05 '22 18:11 sgtcoder

portainer/portainer-ee:alpine

Also does not have /bin/bash

Why do you need bash?

modem7 avatar Nov 05 '22 18:11 modem7

portainer/portainer-ee:alpine Also does not have /bin/bash

Why do you need bash?

Trying to implement a healthcheck. Is there no support for that?

sgtcoder avatar Nov 05 '22 18:11 sgtcoder

portainer: error: unknown long flag '--healthcheck', try --help

sgtcoder avatar Nov 05 '22 18:11 sgtcoder

portainer/portainer-ee:alpine Also does not have /bin/bash

Why do you need bash?

Trying to implement a healthcheck. Is there no support for that?

You don't need bash to implement the healthcheck. Ash and Bourne are more than sufficient (you don't need those either tbf).

Healthcheck:

    healthcheck:
      test: "wget --no-verbose --tries=1 --spider --no-check-certificate http://localhost:9000 || exit 1"
      interval: 60s
      timeout: 5s
      retries: 3
      start_period: 20s

Full example:

 #Portainer EE - Docker Frontend/GUI
  portainer:
    image: portainer/portainer-ee:alpine
    container_name: Portainer
    hostname: Portainer
    command: -H unix:///var/run/docker.sock
    logging:
      driver: "json-file"
      options:
        max-size: 10m
        max-file: "3"
    networks: #Prevents Docker from creating a default stack network
      pihole:
        ipv4_address: '172.22.0.101'
    ports:
      - "9000:9000/tcp"
      - "8000:8000/tcp"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - $USERDIR/Portainer:/data
      - /etc/localtime:/etc/localtime
    healthcheck:
      test: "wget --no-verbose --tries=1 --spider --no-check-certificate http://localhost:9000 || exit 1"
      interval: 60s
      timeout: 5s
      retries: 3
      start_period: 20s
    restart: always
    mem_limit: 250m
    mem_reservation: 100m

modem7 avatar Nov 05 '22 18:11 modem7

Thank you. That ended up working. It might have been because I was using curl; I was then going to open a bash shell in the container to test the command.

I also noticed the mem_limit... is there like a cpu_limit as well?

sgtcoder avatar Nov 05 '22 18:11 sgtcoder