team-compass
Upstream the crypto-mining prevention tools we use on mybinder.org into the BinderHub helm chart?
Providing any kind of easily accessible cloud compute resource attracts abusers, whether that infrastructure is a BinderHub, JupyterHub, or something else. However, this abuse is felt most keenly by those running community-driven, widely accessible (i.e. no auth, or auth open to anyone with a valid account) BinderHubs.
The mybinder team has spent a lot of time developing various mining-detection-and-killing tools to combat this abuse and it is a reasonable assumption that other people deploying BinderHubs would want the same toolset available. Rather than duplicating the work, can we make these tools available to be enabled and configured via the BinderHub helm chart?
We at 2i2c will soon be redeploying Pangeo's Binder instance (which has been down since December 2021 due to crypto-mining) and we would love to be able to reuse these tools. Though I recognise there is an "if you show them how you're stopping them, they'll find another way" risk here as well.
cc: @yuvipanda
How tightly coupled are the tools to mybinder.org/BinderHub? Is it worth putting them in their own repo/Helm chart and making it an optional dependency of the BinderHub helm chart, or do you think it's best to bundle it in the BinderHub repo? Is there scope for future collaboration on using these tools with other applications, e.g. https://github.com/jupyterhub/team-compass/issues/478, or are they too specific to BinderHub?
How tightly coupled are the tools to mybinder.org/BinderHub? [...] Is there scope for future collaboration on using these tools with other applications, e.g. #478, or are they too specific to BinderHub?
I don't know enough about the tools to answer these questions. @minrk?
Is it worth putting them in their own repo/Helm chart and making it an optional dependency of the BinderHub helm chart, or do you think it's best to bundle it in the BinderHub repo?
A separate Helm chart might be a good idea, if the tools are capable of working across multiple k8s namespaces?
For instance, 2i2c maintain a "support" chart that includes grafana, ingress-nginx, prometheus, etc - all things that get deployed once per cluster. No matter how many hubs we deploy to that cluster, they can be covered by the services provided by this chart. If the anti-mining tools can be deployed similarly, so as a dependency of the support chart rather than the BinderHub chart, that would be interesting... We might be able to protect all hubs on a cluster with a single deployment. @yuvipanda wdyt?
It's a bit of both: it's not specific to BinderHub (makes sense with JupyterHub), but it's also fairly specific to mybinder.org (e.g. there are some assumptions about how the cluster is set up, and requires extreme privileges).
I think an independent chart makes the most sense, if folks want to re-use it.
There is also the (extremely flimsy) level of obfuscation that exactly how we identify processes to terminate is encrypted, while the general collection and termination is not. We'd need to change that to expose everything we do in order to make it usable as a chart, which would also reveal how truly trivial it is to circumvent.
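To make the "collect, then match, then terminate" shape concrete, here is a minimal Python sketch of the matching step only. It is not the mybinder.org implementation: the actual identification rules are encrypted and are not reproduced here. The `ProcessInfo` type, the `find_banned` function, and the patterns in `BANNED_SUBSTRINGS` are all hypothetical names chosen for illustration, and the sketch assumes the process list has already been collected from the node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessInfo:
    """Minimal stand-in for one entry collected from a node's process table."""
    pid: int
    cmdline: str


# Hypothetical patterns, for illustration only; the real mybinder.org rules
# are encrypted and are not reproduced here.
BANNED_SUBSTRINGS = ("xmrig", "minerd")


def find_banned(processes, banned=BANNED_SUBSTRINGS):
    """Return the PIDs whose command line contains any banned substring."""
    flagged = []
    for proc in processes:
        lowered = proc.cmdline.lower()  # case-insensitive comparison
        if any(pattern in lowered for pattern in banned):
            flagged.append(proc.pid)
    return flagged


processes = [
    ProcessInfo(101, "jupyter-lab --ip=0.0.0.0"),
    ProcessInfo(202, "./xmrig --url pool.example.org:3333"),
    ProcessInfo(303, "./totally-not-a-miner --url pool.example.org:3333"),
]
flagged = find_banned(processes)
print(flagged)
```

Only PID 202 is flagged. The renamed binary (PID 303) slips through, which is the kind of trivial circumvention that name-based matching invites.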
I do think a more robust approach would be to follow something like this (thanks @yuvipanda!).
I've experimented with the Monero-killing approach via bpftrace, and here's a very early helm chart that works on GKE at least: https://github.com/yuvipanda/bloodcoins. It kills Monero miners immediately.

I want to figure out how many false positives this hits though.
We've deployed cryptnono to mybinder.org on both GKE and Turing, and it works fairly well! Deployment on OVH is pending https://github.com/jupyterhub/mybinder.org-deploy/issues/2176.
I think the next step would be to find a way to upstream / generalize what ban.py does.