shoryuken
Implementing on Kubernetes (looking for some advice)
Hello guys, we are using ruby-shoryuken on a basic k8s cluster. Do you have a procedure for checking that shoryuken is "alive" and able to process data? We are using a basic bash one-liner right now, but we don't think that is the best approach. Do you have something for implementing a health check, or something you can recommend for us to test?
Thanks in advance!
It depends on what you really want to check... In my view it makes sense to split health checks two ways.
For something like a worker, it makes sense to me that a health check should monitor the health of the container and not much else. For that, an API was added into Shoryuken https://github.com/ruby-shoryuken/shoryuken/blob/d4af0447b1ecfa2cfe9bb471e7e3366b7cd03033/lib/shoryuken.rb#L48 with a view to expanding it in the future if desired. Currently it goes through and checks that all the managers have running executors.
This is unfortunately just part of the Ruby API at the moment, so at work we actually boot shoryuken programmatically, and during the boot process we spin up a little webserver with a health check endpoint - that endpoint calls this Shoryuken.healthy? method. It would be difficult to move this out of the Ruby API because it checks what's currently going on within shoryuken itself. Maybe we could add more hooks into shoryuken that let the user spin up a separate thread with a webserver in it, which would build this behaviour into shoryuken so we're not having to boot everything programmatically? Or maybe it makes more sense to just make it easier to boot shoryuken programmatically, and that gives users all the flexibility they need? I'm not 100% sure off the top of my head, but in my opinion we definitely need a more ergonomic way of getting this API exposed.
In addition, it is worth having alerts on the SQS queues themselves (either via CloudWatch or something else like New Relic) that fire if a backlog starts to build. A growing backlog is a good indication that you need to scale up, and it also works well for paging ops if the backlog goes way beyond acceptable levels.
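To make the "little webserver" idea concrete, here is a minimal sketch using only Ruby's standard library. The `HealthServer` class, its `check:` parameter, and the port are hypothetical names for illustration and are not part of Shoryuken; the only Shoryuken call assumed is `Shoryuken.healthy?`, which would be passed in as the check. A real setup would more likely use Rack, as described later in the thread.

```ruby
require "socket"

# Tiny health endpoint served from a background thread (stdlib only).
# `check` is any callable returning true/false. Inside a Shoryuken process
# it would be `-> { Shoryuken.healthy? }` (assumes the shoryuken gem is loaded).
class HealthServer
  attr_reader :port

  # host defaults to all interfaces so a container probe can reach it;
  # port 0 asks the OS for a free port (handy in tests).
  def initialize(check:, host: "0.0.0.0", port: 0)
    @check = check
    @server = TCPServer.new(host, port)
    @port = @server.addr[1]
  end

  # Accept connections in a background thread; returns the thread.
  def start
    @thread = Thread.new do
      loop { handle(@server.accept) }
    rescue IOError, Errno::EBADF
      # The listening socket was closed by #stop; let the thread finish.
    end
  end

  def stop
    @server.close
    @thread&.join(1)
  end

  private

  def handle(client)
    path = client.gets.to_s.split[1]
    # Drain the request headers so closing the socket does not reset it.
    while (line = client.gets) && line != "\r\n"; end
    status, body = respond(path)
    client.write("HTTP/1.1 #{status}\r\nContent-Type: text/plain\r\n" \
                 "Content-Length: #{body.bytesize}\r\nConnection: close\r\n\r\n#{body}")
  rescue StandardError
    # A broken probe connection must never take the worker down.
  ensure
    client.close
  end

  def respond(path)
    return ["404 Not Found", "not found\n"] unless path == "/health"

    healthy = begin
      @check.call
    rescue StandardError
      false # a raising check counts as unhealthy
    end
    healthy ? ["200 OK", "ok\n"] : ["503 Service Unavailable", "unhealthy\n"]
  end
end
```

In a worker you would call something like `HealthServer.new(check: -> { Shoryuken.healthy? }, port: 8080).start` before the Shoryuken launcher begins polling. Where exactly that call goes depends on how you boot Shoryuken.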
Hi @LyleDavis, thanks for helping us. We have the first scenario, a worker: the shoryuken process running inside a container. In that scenario we don't have the small webserver running, which is why we are currently testing the shoryuken process with bash. We are running the worker like this: scl enable rh-ruby26 -- bundle exec shoryuken -R -C config/shoryuken.yml
So, do you think it's enough just to test whether the process is running?
For example, in the k8s deployment:
livenessProbe:
  exec:
    command:
      - sh
      - -c
      - |-
        SHORYUKEN=$(ps -ef | grep "shoryuken" | grep -v grep | awk '{print $3}' | wc -l);
        if [ "${SHORYUKEN}" != 2 ]; then exit 1; fi
Again thanks a lot.
I am also having a similar issue. Did the livenessProbe work for you?
@TruptiHosmani The way we do this now is by booting up a tiny rack webserver during Shoryuken boot with a rack middleware that exposes a /health endpoint. That endpoint calls the Shoryuken.healthy? method to check all relevant queue groups etc have healthy managers with running executors. This is more reliable than just checking the process exists, we’ve found.
If you’re using ECS you can configure the health check to call that endpoint. In Kubernetes a liveness probe can probably do the same thing with curl or something.
We were able to generalise it further so we have a health check gem that provides it all with opt-in features for checking a whole lot of different things, so the same health check gem is used throughout a bunch of different kinds of ruby services whether they’re workers or not.
However, how you boot the webserver will entirely depend on how you’re implementing your Shoryuken service I’m afraid. We’re booting Shoryuken programmatically so we can do it at that point before it gets to the point of actually running the worker, but you may be able to use Shoryuken lifecycle hooks to run a thread that boots Rack.
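If the container image has Ruby but no curl, the probe side can be a few lines of Ruby too. This is only a sketch: `healthy_endpoint?`, the port 8080, and the /health URL are assumptions for illustration, and the helper simply treats any 2xx response as healthy and everything else (including timeouts and refused connections) as unhealthy.

```ruby
require "net/http"
require "uri"

# Returns true when GET <url> answers with a 2xx status within the timeout.
# Any error (connection refused, timeout, bad response) counts as unhealthy.
def healthy_endpoint?(url, timeout: 2)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port, nil) # nil: never go through a proxy
  http.open_timeout = timeout
  http.read_timeout = timeout
  path = uri.path.empty? ? "/" : uri.path
  http.get(path).is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end
```

An exec liveness probe could then run a one-liner that exits 0 or 1 based on `healthy_endpoint?("http://127.0.0.1:8080/health")`. A plain Kubernetes httpGet probe against the same endpoint avoids the extra script entirely, if your setup allows it.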