Actions and services feedback (health check questions)

Open dentarg opened this issue 4 years ago • 12 comments

Hi @chrispat and @mscoutermarsh 👋

As you have been active on this repo, I assume you have some insight into the services side of GitHub Actions. I hope you don't mind receiving some feedback here. I thought this would reach the right people more quickly than going via [email protected].

I'm thankful for the postgres example, as it brought up a few things that aren't in the docs (yet?): ports get randomly assigned (this is mentioned, but I think I overlooked it because ${{ job.services.redis.ports['6379'] }} wasn't used in the YAML snippet), and health checks are needed.
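To make the random port assignment concrete, here is a minimal sketch of how a step could read the host port that was mapped to a service. The job name, the redis image and the REDIS_PORT variable are assumptions made for illustration; they are not taken from the postgres example.

```yaml
jobs:
  test:                      # hypothetical job name
    runs-on: ubuntu-latest
    services:
      redis:
        image: redis
        ports:
          # Only the container port is given, so the host port is chosen at random.
          - 6379/tcp
    steps:
      - name: Show the host port mapped to the service
        env:
          # Look up the randomly assigned host port for container port 6379.
          REDIS_PORT: ${{ job.services.redis.ports['6379'] }}
        run: echo "redis is reachable on localhost:${REDIS_PORT}"
```

Passing the port through an environment variable keeps the expression out of the shell script, so the tests only need to read REDIS_PORT.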

About the health checks, my tests need Memcached and RabbitMQ, and I've been successful in starting memcached, but not RabbitMQ.

Here's the services definition of my job:

    services:
      memcached:
        image: memcached:latest
        ports:
        - 11211/udp
        # needed because the memcached container does not provide a healthcheck
        options: --health-cmd "timeout 5 bash -c 'cat < /dev/null > /dev/udp/127.0.0.1/11211'" --health-interval 10s --health-timeout 5s --health-retries 5
      rabbitmq:
        image: rabbitmq:latest
        ports:
        - 5672/tcp
        # needed because the rabbitmq container does not provide a healthcheck
        options: --health-cmd "rabbitmqctl node_health_check" --health-interval 10s --health-timeout 5s --health-retries 5

It looks like the container needs to report healthy in ~30 seconds (the other time I ran this it said "waiting 32 seconds ..."):

    starting
    rabbitmq service is starting, waiting 29 seconds before checking again.
    /usr/bin/docker inspect --format="{{if .Config.Healthcheck}}{{print .State.Health.Status}}{{end}}" 9d446afa3d021ab0c08c48fd699119a452a109783c1c2032862cc813562b7474
    unhealthy
    ##[error]Failed to initialize, rabbitmq service is unhealthy.

I'm not sure why it isn't reporting healthy; it works locally for me in less than 30 seconds. Maybe my computer is faster than GitHub's? :-)

Is it possible for GitHub Actions to give me more info on what's happening? For example, the output of docker inspect --format='{{json .State.Health}}'
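Until the runner prints more, one possible way to debug this by hand is sketched below. It assumes that a failed health check stops the job before any step runs (as the log above suggests), so the --health-* options are left out while debugging and the health command is polled from a step instead. The job name, the retry count and the CONTAINER_ID variable are hypothetical.

```yaml
jobs:
  debug-rabbitmq:            # hypothetical job name
    runs-on: ubuntu-latest
    services:
      rabbitmq:
        image: rabbitmq:latest
        ports:
          - 5672/tcp
        # Assumption: no --health-* options here while debugging, so the
        # runner does not wait for a health status before running the steps.
    steps:
      - name: Poll the health command by hand
        env:
          # ID of the service container, taken from the job context.
          CONTAINER_ID: ${{ job.services.rabbitmq.id }}
        run: |
          for attempt in 1 2 3 4 5 6 7 8 9 10; do
            if docker exec "$CONTAINER_ID" rabbitmqctl node_health_check; then
              echo "healthy after $attempt attempt(s)"
              exit 0
            fi
            sleep 10
          done
          # Still failing: show what the broker logged before giving up.
          docker logs "$CONTAINER_ID"
          exit 1
```

If the --health-* options are kept and the job does reach its steps, running docker inspect --format='{{json .State.Health}}' "$CONTAINER_ID" in a step should print the recorded health check results for the container.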

dentarg avatar Aug 26 '19 16:08 dentarg

@dentarg I've had the same troubles trying to get RabbitMQ to start healthily here. I posted a thread at github.community with the same issue.

I'll keep an eye on both!

jpwilliams avatar Sep 18 '19 13:09 jpwilliams

@jpwilliams RabbitMQ is now starting for me, for example here: https://github.com/dentarg/actions-test/commit/ac7b4ed6aae34a1682166fbc8e03871ac80577d3/checks#step:2:128

dentarg avatar Sep 20 '19 20:09 dentarg

@dentarg Absolute star. I tried lots of different things, but, following your example, the final changes were switching the image from rabbitmq:management to rabbitmq:latest and changing the options for the service from:

options: '--health-cmd "nc -z localhost 5672" --health-interval 10s --health-timeout 10s --health-retries 6 --health-start-period 60s'

To:

options: --health-cmd "rabbitmqctl node_health_check" --health-interval 10s --health-timeout 5s --health-retries 5

Very glad it's working, but was this actually a meaningful change or have they fixed something behind the scenes anyway? :D

jpwilliams avatar Sep 20 '19 22:09 jpwilliams

@jpwilliams The changes did two things:

The difference between the rabbitmq:management and rabbitmq:latest images is that the former comes with the management plugin installed and enabled by default. (That's documented on https://hub.docker.com/_/rabbitmq)

rabbitmqctl node_health_check does a lot more than nc -z does. I got that command from https://github.com/docker-library/rabbitmq/pull/174.

However, I suspect GitHub has tweaked things behind the scenes, as I did not touch my setup between when it wasn't working (28 days ago) and now, when it is working.

dentarg avatar Sep 24 '19 00:09 dentarg

@dentarg Aye I'm aware of the two changes there, but I don't think either are what actually made it work, is all.

The inclusion of the management plugin shouldn't really affect the usual working of RabbitMQ (and I had tried without before too) and even just waiting for RabbitMQ to start (using no health checks) didn't work, so the health check wasn't the real issue.

Still, very happy it's working now and, aye, I think something changed behind the scenes, which shows GitHub hard at work on the beta! :)

jpwilliams avatar Sep 24 '19 08:09 jpwilliams

The only recent changes we have made in the area of service containers were to stop overriding the workdir and adding the workspace volume mounts. It is possible that is part of what resolved your issue.

https://github.blog/changelog/2019-09-18-improvements-to-github-actions/

If everything is good here please close out the issue.

chrispat avatar Sep 25 '19 14:09 chrispat

@chrispat Can you comment on the last part of my post? How would one debug an issue like this if it happens again? Is it possible for GitHub Actions to print more debug info? (See the suggestion in my post.)

dentarg avatar Sep 25 '19 17:09 dentarg

@chrispat Sounds like that might have been the real fix then! Cool!

Along with @dentarg, I'd love to know any methods of debugging these issues ourselves in the future if there's a way!

jpwilliams avatar Sep 26 '19 07:09 jpwilliams

Would be nice to add the RabbitMQ service as an example in the repo. I was looking for it myself :)

AbdealiLoKo avatar Dec 11 '19 06:12 AbdealiLoKo

Hi @jpwilliams, could you please add a few more details about your solution w.r.t. RabbitMQ and Actions? I am running into an issue where I can't connect to the RabbitMQ service from a pika library call. Thanks!

Roland-djee avatar Jun 03 '20 10:06 Roland-djee

Hi @Roland-djee!

My post here at the GitHub Support Community outlines it a bit better.

A working example can be seen at jpwilliams/remit/.github/workflows/pushtest.yml.

jpwilliams avatar Jun 03 '20 11:06 jpwilliams

@jpwilliams Thanks for the prompt reply! Actually, I have everything working on the service side :) The thing is, when I try to open a pika connection from a Python script, it never succeeds. Have you had any more success, perhaps?

Roland-djee avatar Jun 03 '20 11:06 Roland-djee