mybinder.org-deploy
Deploy gitter bot to write annotations to Grafana
Fixes #312
This is a (very) simple gitter bot that creates grafana annotations for messages that start with !log.
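For illustration, the core of such a bot could look roughly like the sketch below. It is not the code from this PR: the function name `annotation_from_message` and the default `gitter` tag are made up here, and the only thing borrowed from Grafana is the general shape of an annotation body (`time` in epoch milliseconds, `tags`, `text`) as accepted by its HTTP annotations endpoint.

```python
from datetime import datetime, timezone

LOG_PREFIX = "!log"


def annotation_from_message(text, sent_at, tags=("gitter",)):
    """Return a Grafana-style annotation dict for a `!log` message, else None.

    `text` is the raw chat message and `sent_at` a timezone-aware datetime.
    The helper name and the default tag are assumptions for this sketch.
    """
    stripped = text.strip()
    if not stripped.startswith(LOG_PREFIX):
        return None  # not a log message, ignore it
    body = stripped[len(LOG_PREFIX):].strip()
    if not body:
        return None  # a bare "!log" carries nothing worth annotating
    return {
        "time": int(sent_at.timestamp() * 1000),  # epoch milliseconds
        "tags": list(tags),
        "text": body,
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(annotation_from_message("!log bumped the hub image", now))
    print(annotation_from_message("just chatting", now))
```

The returned dict would then be sent to Grafana as JSON with an API key; that part is left out so the sketch stays runnable without a Grafana instance.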
Can we run this on the k8s cluster as well? Or would it make sense to run it somewhere else so it doesn't go down when we have a problem? Or ...?
Running this on the cluster makes sense to me. Do you want to take a stab at that here?
I'll take a stab at making this run on the cluster. An opportunity to learn how to add new things to the helm chart. I guess if the cluster is so broken that the bot doesn't work any more, chances are Grafana & co. are down too.
True. If you want an example, #398 is also adding a new pod to run in the cluster. I can't tell you if it's a good example yet, though.
Hmm, after today's incidents with Grafana, I'm wondering if we should run this in the same cluster at all. Perhaps a separate 'misc' cluster that runs our various bots and Grafana? That would also allow us to put them in a separate repo.
We should definitely make this bot more robust against Grafana failures. I think our deployment got "stuck" once yesterday because it couldn't create the Grafana annotation. That shouldn't happen, and for this bot it means it needs a buffer/queue of some kind.
Also wanted to say, let's not make the perfect the enemy of the good, etc. So this should go ahead in whatever form makes it deployable :)
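A minimal sketch of what such a buffer could look like, assuming the bot hands over annotations as dicts and has some `send` callable that raises when Grafana is unreachable. The class name `AnnotationBuffer` and its methods are hypothetical, not part of this PR.

```python
from collections import deque


class AnnotationBuffer:
    """Hold annotations in memory until they can be delivered.

    `send` is any callable that raises an exception on failure
    (hypothetical; in the bot it would wrap the HTTP call to Grafana).
    """

    def __init__(self, send, maxlen=1000):
        self._send = send
        # Bounded queue: the oldest entries are dropped if Grafana stays down.
        self._pending = deque(maxlen=maxlen)

    def add(self, annotation):
        self._pending.append(annotation)

    def flush(self):
        """Send pending annotations in order; stop at the first failure.

        Returns the number of annotations delivered in this call.
        """
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except Exception:
                break  # leave it queued and try again on the next flush
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

The caller would `add()` every annotation and call `flush()` periodically, so a Grafana outage delays annotations instead of blocking whatever triggered them.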
I think we should create a support cluster, and I will work on that after releasing JupyterHub 0.9 if nobody else does it first. In the meantime we can and should do this one in the existing cluster, since I think it'll be handy.
@betatim would you like to take a stab at adding a chartpress config for building and using the image?
@betatim if the grafana message posting automation is blocking on this PR, why don't we start with a small addition to the binder SRE tools that will just use the gitter API to grab all the lines with !log in them and collect them as a time-stamped dataframe, or something like this?
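Roughly, that could be as small as the sketch below. It assumes the messages have already been fetched from the Gitter API as a list of dicts with `text`, `sent` (an ISO 8601 timestamp ending in `Z`) and `fromUser.username` fields; those field names are written from memory and should be checked against the API docs. The function name `collect_log_lines` is made up.

```python
from datetime import datetime


def collect_log_lines(messages):
    """Return the `!log` messages as time-sorted rows (a list of dicts).

    `messages` is assumed to be a list of Gitter-style message dicts with
    `text`, `sent` and `fromUser` keys (field names are an assumption).
    """
    rows = []
    for msg in messages:
        text = msg.get("text", "").strip()
        if not text.startswith("!log"):
            continue
        # fromisoformat() on older Pythons rejects a trailing "Z"
        sent = datetime.fromisoformat(msg["sent"].replace("Z", "+00:00"))
        rows.append(
            {
                "time": sent,
                "user": msg.get("fromUser", {}).get("username"),
                "text": text[len("!log"):].strip(),
            }
        )
    rows.sort(key=lambda row: row["time"])
    return rows
```

The result is a plain list of dicts, so `pandas.DataFrame(rows)` should turn it into the time-stamped dataframe if pandas is available.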
@betatim @minrk I picked this back up, and made some progress.
- Builds image automatically
- Uses asyncio to listen to multiple rooms at the same time
Still to do:
- [x] Find a way to auth to Gitter without giving it my (irrevocable!) gitter personal access token
- [ ] Build some network resiliency into the script
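For anyone picking this up, the asyncio part boils down to something like the following sketch. It is not the script from this PR: `fake_stream` stands in for whatever yields messages from a Gitter room, and `listen`, `listen_all` and `handle` are hypothetical names.

```python
import asyncio


async def fake_stream(messages):
    """Stand-in for a room's message stream (hypothetical, no network)."""
    for message in messages:
        await asyncio.sleep(0)  # yield control, as a real stream would
        yield message


async def listen(room, stream, handle):
    """Pass every message from one room's stream to `handle`."""
    async for message in stream:
        await handle(room, message)


async def listen_all(streams, handle):
    """Listen to all rooms concurrently; `streams` maps room name to stream."""
    await asyncio.gather(
        *(listen(room, stream, handle) for room, stream in streams.items())
    )


if __name__ == "__main__":
    async def print_message(room, message):
        print(f"[{room}] {message}")

    rooms = {
        "room-a": fake_stream(["!log restarted hub"]),
        "room-b": fake_stream(["hello", "!log scaled up nodes"]),
    }
    asyncio.run(listen_all(rooms, print_message))
```

The network resiliency item would most likely live inside `listen`, for example by reconnecting the stream in a loop when it drops; that is left out here.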
Set up a Gitter key for this. It comes from a GitHub bot account I created. Account details for the bot account are in secrets/bot-accounts.yaml
What's the plan here?
I'd still love to get this deployed!
If someone wants to take over this PR and finish it that would be fine with me
We don't really use Gitter anymore, and the moment is lost. If only I could go back to 2018, I'd do so many things differently - including merging this PR as soon as it came up.