Run Riju on Kubernetes
I see that the self-hosting instructions are extremely reliant on cloud-specific architectural considerations (mostly AWS stuff). Is there a way to abstract that out and create a Kubernetes deployment mode or perhaps a Helm chart?
I would be glad to take this task on if I had guidance.
Thanks
I'd be happy to advise you on this if you'd like to deploy on Kubernetes. The reason for the current architectural decisions is primarily cost minimization; I don't have anywhere near the budget it would take to run a k8s cluster.
You may want to start by reviewing the current architecture (i.e. Terraform and artifact registry usage) and think about how you'd set things up if you were instead deploying to k8s. I'm happy to answer any questions you may have about details that are unexplained. Here are my initial thoughts on the components that will need attention:
- We need somewhere to cache the Debian packages after they are built, unless you want a 12-hour compilation time. I'm using S3 for that right now.
- We need somewhere to host the Docker images. Riju can just as easily use any API-compatible registry, such as Docker Hub or GitHub Packages.
- The supervisor binary runs on bare metal and currently handles blue/green deployments. Some of that could be obviated by using Kubernetes-native container orchestration; the reason I didn't opt for that is deployment speed, since pulling down all the language images takes over 20 minutes. With the supervisor binary, deploying a new version means only the changed images need to be pulled. You might be able to get around this with a local Docker registry cache or something, I'm not sure.
- Most of the architectural complexity is there to minimize cost and compilation/deployment time, both of which are hard problems for Riju (what with there being 150 GB of Docker images that take the better part of a day to build, and what with how expensive EBS is). If you're willing to sacrifice on those, then conceptually Riju is extremely simple: it's just a Docker image for the app that depends on the language Docker images being available on the host. As long as you have all the images on the host, and you're willing to mount the Docker socket into the application image, you're basically good to go (see the sketch after this list).
- Don't forget you'll be hit by huge data egress costs as soon as you take any part of the system outside AWS. Isn't billing fun? :)
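To make the Docker-socket point concrete, here's a rough sketch, using the official Python Kubernetes client, of what the dead-simple deployment might look like: one Pod running the app image with the host's Docker socket mounted in. Every name, tag, and port in it is a made-up placeholder, not something from the actual codebase.

```python
# Sketch only, not a tested manifest. Assumes the app image and all
# language images are already present on the node; every name, tag,
# and port below is a hypothetical placeholder.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when run in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="riju-app"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="app",
                image="riju:app",  # placeholder tag for the app image
                ports=[client.V1ContainerPort(container_port=8080)],  # illustrative
                volume_mounts=[
                    client.V1VolumeMount(
                        name="docker-sock",
                        mount_path="/var/run/docker.sock",
                    )
                ],
            )
        ],
        volumes=[
            # Mounting the host's Docker socket lets the app start the
            # language containers directly on the node, as described above.
            client.V1Volume(
                name="docker-sock",
                host_path=client.V1HostPathVolumeSource(
                    path="/var/run/docker.sock"
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```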
Feel free to challenge any of my assumptions about how things have to be designed for the deployment system to be maintainable! I'm sure things could be improved.
Haha, so I have a self-hosted k3s cluster with a couple of Raspberry Pi 4s and a couple of old laptops that I had lying around. I even have some fast SSD storage attached to them, and the performance is surprisingly good. This seemed like a fun but complex enough project to really dig deep and learn how to build things the right way on Kubernetes, which is why I opened the issue.
I do have some analogs to the current services (MinIO as an S3-compatible API, Harbor for a Docker registry), and as for caching the Debian packages, I'm sure it should be as simple as mounting a persistent volume.
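Concretely, I'd expect that pointing the existing S3-style caching at MinIO is about this much work. This is just a sketch, and the endpoint, credentials, bucket, and key names are all placeholders I made up:

```python
# Sketch: the S3-style package cache pointed at MinIO instead of AWS.
# Endpoint, credentials, bucket, and key names are made-up placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.internal:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)

BUCKET = "riju-debs"

def cache_deb(path: str, key: str) -> None:
    """Upload a built .deb so later builds can skip recompilation."""
    s3.upload_file(path, BUCKET, key)

def fetch_deb(key: str, path: str) -> bool:
    """Download a cached .deb; return False on a cache miss."""
    try:
        s3.download_file(BUCKET, key, path)
        return True
    except s3.exceptions.ClientError:
        return False
```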
Do you think the infrastructure needs a complete rethinking from the ground up when going into Kubernetes land, in order to optimise things? For example, some constraints would be a lot easier to enforce, like the fine-grained resource limits that Kubernetes provides.
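To illustrate the kind of limits I mean, capping a per-session container would look something like this (the image tag is hypothetical and the figures are purely illustrative, not tuned for Riju):

```python
# Sketch: per-session CPU/memory caps via Kubernetes resource limits.
# The image tag is hypothetical and the figures are not tuned for Riju.
from kubernetes import client

session_container = client.V1Container(
    name="language-session",
    image="riju:lang-python",  # placeholder per-language image tag
    resources=client.V1ResourceRequirements(
        requests={"cpu": "250m", "memory": "256Mi"},  # guaranteed share
        limits={"cpu": "1", "memory": "1Gi"},         # hard ceiling
    ),
)
```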
> Do you think the infrastructure needs a complete rethinking from the ground up when going into Kubernetes land, in order to optimise things?
You tell me! I think I've already laid out all the necessary information about how things work currently, let me know if that's not the case. I think you would be the better judge of whether there's a way that things could be implemented using k8s primitives that would be simpler than what we're doing currently.
I'll take a look at stuff and get back :)
There are also a few other advantages I could see (using something like gVisor to harden the security a bit more).
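For reference, opting a pod into gVisor is just a matter of setting its runtime class. This sketch assumes the cluster already defines a RuntimeClass named "gvisor" backed by the runsc handler, which is a common convention but not a given:

```python
# Sketch: opting an untrusted-code pod into gVisor. Assumes the cluster
# already defines a RuntimeClass named "gvisor" backed by the runsc
# handler; the container name and image tag are placeholders.
from kubernetes import client

pod_spec = client.V1PodSpec(
    runtime_class_name="gvisor",  # kernel-level sandboxing via runsc
    containers=[
        client.V1Container(name="session", image="riju:lang-python"),
    ],
)
```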
I do think that could lead to you saving costs as well, if you decide to go the Kubernetes route. AWS is ridiculously priced. Charging for egress is just outrageous, and clearly designed to ensure vendor lock-in.
> I do think that could lead to you saving costs as well, if you decide to go the Kubernetes route
Would it? Kubernetes is just the control plane. You still have to have servers to run the actual thing, and wherever you're hosting is going to charge for network bandwidth. Where are they going to run, if not AWS?
Well, right now my Kubernetes stuff is part of my homelab. So I'm paying exactly $0 extra on top of what I already pay to my ISP. Power costs are minuscule enough that I'm not even considering them.
Sure, it's not as reliable or as fast as an instance on AWS, but for personal projects I prefer doing it this way because you end up learning more.
However, even if you want to provide a service with more nines of uptime, take a look at alternative providers like OVH or DigitalOcean. They come with a huge amount of bandwidth for free each month (DigitalOcean includes 1 TB of free egress bandwidth with the most basic $5-per-month server).
Bootstrapping something like k3s or k0s yourself on these nodes gives you incredible value for money over using different AWS services, and over managed Kubernetes like EKS as well (with the tradeoff, of course, of administering the cluster and keeping everything working yourself).
Re cost implications of AWS, I've been able to start generating detailed billing breakdowns: https://github.com/raxod502/riju/blob/master/financials/2021-08/breakdown.txt
Could be helpful in evaluating the costs of AWS versus self-hosting versus other providers.
I also have a spreadsheet that (among other things)
I'm revisiting this. I think you've hit on an interesting idea to use k3s or k0s on bare metal. When I've thought about Kubernetes in the past I've been imagining the typical multi-node configuration. But there's no reason not to use the k8s API within a single, large node. And doing so might actually help reduce the amount of orchestration code I have to write, for things like lazy-pulling images and doing blue-green deployment cutover.
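In k8s terms, the cutover the supervisor currently does by hand would reduce to repointing a Service's selector at whichever set of pods is live, along the lines of this sketch (the service name and labels are hypothetical):

```python
# Sketch: blue/green cutover by repointing a Service at the pods labeled
# with the new color. The service name and labels are hypothetical.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

def cut_over(service: str, namespace: str, color: str) -> None:
    """Switch traffic to the Deployment whose pods carry `color`."""
    patch = {"spec": {"selector": {"app": "riju", "color": color}}}
    core.patch_namespaced_service(service, namespace, patch)

# Once the "green" pods pass their readiness checks:
cut_over("riju", "default", "green")
```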
I'll investigate replacing the supervisor binary with single-node k8s at some point.