Kubernetes System

Open DazWilkin opened this issue 4 years ago • 9 comments

I think it would be interesting to have a Kubernetes-backed implementation.

This would provide a more generic solution than per-cloud implementations and could facilitate cross-cloud deployments too.

DazWilkin, Oct 04 '19 05:10

A k8s implementation sounds natural, or even preferable. However, from the author's description of the project, it feels like the intended direction may be a fully self-contained app (please correct me if I'm wrong). Honestly, I don't see how that can work at a larger scale. If that's not the case, then the current implementation still needs some work to run on a k8s-type engine.

xnwsw, Oct 04 '19 23:10

There's definitely work to be done.

Once I'm able to get the standalone code working, I'll report back.

My hunch is that Kubernetes provides a valuable solution to the problem of distributing compute over an arbitrary number of nodes (whether these be on the localhost, on Google Compute Engine, on EC2 or across them).

Providing the functionality to spin up multiple VMs on these platforms to pour a compute job onto them seems to be more work than it needs to be.

An additional possibility is for e.g. Google App Engine or Google Cloud Run to serve as the backend fabric for these services.

As I say, I'm naive about bigmachine but, as a former Googler, I'm familiar with other ways we've solved this.

DazWilkin, Oct 05 '19 00:10

> A k8s implementation sounds natural, or even preferable. However, from the author's description of the project, it feels like the intended direction may be a fully self-contained app (please correct me if I'm wrong). Honestly, I don't see how that can work at a larger scale. If that's not the case, then the current implementation still needs some work to run on a k8s-type engine.

I agree that k8s would be a fine backend for Bigmachine, especially as the number of k8s installations and services is increasing.

What Bigmachine provides is a relatively low-level but still economical API that is designed to support multiple backends. The goal with Bigmachine is to provide an abstraction that lets us build applications and frameworks (like bigslice) that are independent of the underlying infrastructure.
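
To illustrate the general idea of code that is independent of the underlying infrastructure, here is a minimal Go sketch. It is an assumption-laden toy, not Bigmachine's actual API: the `Backend` interface, `localBackend`, `kubernetesBackend`, `fakeAddrs`, and `run` are all hypothetical names, and the backends only fabricate addresses instead of starting real machines or pods.

```go
package main

import (
	"fmt"
	"strings"
)

// Backend is a hypothetical stand-in for an infrastructure provider.
// It is NOT Bigmachine's real interface; it only illustrates the pattern.
type Backend interface {
	Name() string
	// Start pretends to launch n workers and returns their addresses.
	Start(n int) ([]string, error)
}

// localBackend simulates workers running on the local host.
type localBackend struct{}

func (localBackend) Name() string { return "local" }

func (localBackend) Start(n int) ([]string, error) {
	return fakeAddrs("localhost", n), nil
}

// kubernetesBackend simulates workers running as pods in a namespace.
type kubernetesBackend struct{ namespace string }

func (kubernetesBackend) Name() string { return "kubernetes" }

func (k kubernetesBackend) Start(n int) ([]string, error) {
	return fakeAddrs("pod."+k.namespace, n), nil
}

// fakeAddrs builds n placeholder addresses; no real resources are created.
func fakeAddrs(prefix string, n int) []string {
	addrs := make([]string, n)
	for i := range addrs {
		addrs[i] = fmt.Sprintf("%s-%d", prefix, i)
	}
	return addrs
}

// run is the "application": it depends only on the Backend interface,
// so the same code works with any implementation.
func run(b Backend, n int) (string, error) {
	addrs, err := b.Start(n)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%s: %s", b.Name(), strings.Join(addrs, ",")), nil
}

func main() {
	backends := []Backend{localBackend{}, kubernetesBackend{namespace: "default"}}
	for _, b := range backends {
		out, err := run(b, 2)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

In this sketch, adding another backend means adding one more type that satisfies `Backend`; `run` does not change.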

We've been using it in a fairly large-scale way on EC2, scaling easily to 100s of instances (and tens of thousands of cores).

mariusae, Oct 06 '19 22:10

I've begun work on a Kubernetes system:

https://github.com/DazWilkin/bigmachine/tree/kubernetes

DazWilkin, Nov 14 '19 22:11

Very cool! I’ll be following along.

mariusae, Nov 15 '19 16:11

Btw, let me know if there is any missing documentation that would be particularly useful for implementing a new system.

mariusae, Nov 15 '19 16:11

The Kubernetes implementation is straightforward and I have this working.

I've tested on microk8s and Kubernetes Engine and will try on Digital Ocean today.

[The GCE implementation is also working but there is a vestigial bug that I need to diagnose and fix]

I've written an estimator for e that's similar to your π solution.

I will file a few FRs... I'm still unsure about bigmachine's approach but think these alternative runtimes may be useful to you to refine the system's API.

DazWilkin, Nov 15 '19 16:11

This is at a point where it is functional: https://github.com/DazWilkin/bigmachine/tree/kubernetes

I've tested it on microk8s, Kubernetes Engine, and Digital Ocean.

DazWilkin, Nov 18 '19 21:11

Very cool, will take a look!

mariusae, Nov 19 '19 01:11