Kubernetes-compatible discovery method
TL;DR - DNS-based discovery will be necessary; the rest is just about how to configure k8s for best performance. DNS discovery should be configurable via environment variables to make life easy.
Kubernetes could also do discovery via the k8s API from within the pod. DNS discovery + peer exchange is likely the best option, but a call to the API to get the IPs of pods in the same deployment removes the need for peer exchange. This also has the benefit of being zero-config, other than maybe some RBAC and service account setup.
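For illustration, a minimal sketch of the API-based variant, assuming the official `kubernetes` Python client and a hypothetical `app=exo` label on the pods:

```python
from kubernetes import client, config

def discover_peer_ips(namespace: str = "default", selector: str = "app=exo") -> list[str]:
    """List the IPs of all exo pods via the Kubernetes API.

    Needs a service account bound to a Role allowing `list` on `pods`
    in this namespace.
    """
    config.load_incluster_config()  # uses the pod's own service account token
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod(namespace, label_selector=selector)
    return [p.status.pod_ip for p in pods.items if p.status.pod_ip]
```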
It should also have the following setup, imo, for best performance:
- DaemonSet (1 pod per node), or a pod anti-affinity policy that forbids two exo pods on the same node.
- QoS class Guaranteed (Request = Limit)
- GPU Access (this is pretty standard, just a matter of asking for it in the manifests)
These should allow it to run at close to native speed. Multiple pods on the same node are likely to fight over resources. The Guaranteed QoS class prevents noisy neighbors from impacting the pods, at the cost of making those resources unavailable for scheduling. BestEffort or Burstable might work, but performance may be unreliable.
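As a sanity check (not part of exo itself), something like the following sketch, again assuming an `app=exo` label, can confirm the pods landed one per node with the Guaranteed QoS class:

```python
from collections import Counter
from kubernetes import client, config

config.load_kube_config()  # load_incluster_config() when run inside the cluster
v1 = client.CoreV1Api()
pods = v1.list_namespaced_pod("default", label_selector="app=exo").items

for pod in pods:
    print(pod.metadata.name, pod.spec.node_name, pod.status.qos_class)

# One exo pod per node, as a DaemonSet or anti-affinity policy enforces.
per_node = Counter(pod.spec.node_name for pod in pods)
assert all(n == 1 for n in per_node.values()), "two exo pods share a node"
assert all(pod.status.qos_class == "Guaranteed" for pod in pods)
```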
With DNS-based discovery it should work relatively well. Querying `service.namespace.svc.cluster.local` against a headless service returns the IPs of all pods selected by that service. At very high pod counts a single response won't list every pod, but the returned IPs rotate, and since every pod can run the same query, they would only need to exchange known peers.
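A sketch of that lookup using only the standard library; the service name and gRPC port are placeholders:

```python
import socket

def resolve_peers(service_fqdn: str, port: int = 50051) -> set[str]:
    """Resolve a headless Service: one A record per ready pod."""
    infos = socket.getaddrinfo(service_fqdn, port, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}

# e.g. resolve_peers("exo.default.svc.cluster.local")
```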
DNS is needed because multicast is likely unavailable. No cloud env that I know of supports multicast/broadcast, and most, if not all, CNI drivers also do not support it.
If I can find the time, I'd be happy to add both DNS and Kubernetes API discovery.
NOTE: peer exchange is only really necessary for DNS once the number of addresses grows beyond what a single DNS query returns.
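Roughly, each node would union several rotating DNS answers and let peer exchange fill in the rest; reusing `resolve_peers` from the sketch above:

```python
import time

known_peers: set[str] = set()

def refresh_peers(rounds: int = 5, interval: float = 2.0) -> set[str]:
    # Each query may return a different subset at high pod counts,
    # so repeated lookups (plus gossip) converge on the full set.
    for _ in range(rounds):
        known_peers.update(resolve_peers("exo.default.svc.cluster.local"))
        time.sleep(interval)
    return known_peers
```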
The DNS method seems to cover many use cases, such as:
- #724
- #711
- #670
- #187
- #363
Either via DNS configuration, a discovery system like Consul, or a hosts file. Peer exchange would also be valuable here.
I have a heterogeneous garage cluster which I would be happy to use for testing should that be helpful.
Hi @jasoncouture ,
Thanks for working on k8s support.
Kubernetes discovery via DNS/headless service looks like a good option, as it doesn't require extra authorizations (e.g. API access) or extra components (Consul).
K8s Deployments with an optional "one pod per node" anti-affinity might be more flexible than a DaemonSet. Some may want to run multiple pods on the same node (even if it's less efficient).
I guess we'll need a Docker image to run exo as a container. Did you already build one?
> Kubernetes discovery via DNS/headless service looks like a good option, as it doesn't require extra authorizations (e.g. API access) or extra components (Consul).
In-cluster API access is generally available, or at most needs an RBAC configuration for the service account; no extra credentials are required. I agree DNS is the better path to start with, but the k8s API has value that should perhaps be discussed separately.
> K8s Deployments with an optional "one pod per node" anti-affinity might be more flexible than a DaemonSet. Some may want to run multiple pods on the same node (even if it's less efficient).
The DaemonSet suggestion is just for the documentation anyway. But that's fine.
> I guess we'll need a Docker image to run exo as a container. Did you already build one?
Have not yet. It's on my longer todo list.
Hi,
I played around with exo and wrote a Dockerfile: https://github.com/ofauchon/exo/tree/dockerfile/docker/alpine
It's based on Alpine, which complicates the work a bit:
The image takes ~1h to build on my Core i5-8250U. I guess pre-built opencv-python wheel packages are unavailable for Alpine (maybe related to Alpine using the musl C library; not sure).
Anyway, I could run 8 Docker containers at the same time:
I'll try to implement some Consul discovery code to run it on Kubernetes (this way, I won't interfere with your DNS/k8s API work)...
...but I'm not sure I can do this; I'm a beginner with Python...
@ofauchon You're right about the Alpine/musl thing; using a glibc base would avoid the OpenCV compile. A lot of Python packages don't distribute builds for musl.
I'll probably run some tests with busybox:glibc or Debian to speed up image builds.
I think it'd be fine to use a Debian image. Slightly wasteful, sure. But the cost of Alpine is high.
I'd move the Docker container work out of this issue and mark this issue as a blocker for it.
DNS discovery does not require Docker; while this issue references Kubernetes, it provides value in other situations as well.
> I think it'd be fine to use a Debian image. Slightly wasteful, sure. But the cost of Alpine is high.
I agree... when you have to admit that your own choice is making things more difficult, especially for something meant to be easy... :)
I took a stab at this, but I'm not a Python guy.
I'm having a chicken-and-egg problem: I need to create the peer to call its gRPC service and get its peers, but I don't know the peer ID until I get the peers.
Can I just fake an ID and use only the response, discarding the peer handle I used, until its real ID is known?
> I think it'd be fine to use a Debian image. Slightly wasteful, sure. But the cost of Alpine is high.
> I agree... when you have to admit that your own choice is making things more difficult, especially for something meant to be easy... :)
Premature optimization is the root of all evil.
The size of the Python deps far outweighs the OS size.
But a compromise might be this:
https://hub.docker.com/r/jeanblanchard/alpine-glibc
Using glibc rather than musl should make things work fine.
Another option: https://hub.docker.com/_/python
Hi.
I spent some time playing around with Consul, but I'm wondering if it's the right way to go...
Consul's architecture offers great flexibility (high availability, multi-cluster support, service checks, TTLs, metadata, etc.), but it also adds extra complexity (deployment and configuration of a third-party Consul cluster and agents).
In contrast, using the Kubernetes API/DNS doesn't require any extra component. I'm now convinced it's the best solution to start with.
The only difficulty I see is that the Kubernetes API/DNS can't store metadata (capabilities), right?
As a workaround, can we imagine the following solution (sketched after this list)?
- Kubernetes starts exo instances (StatefulSet, DaemonSet, or Deployment) and stores their IPs in its registry (service endpoints)
- Each exo instance queries the Kubernetes API and gets the IPs of all the other peers.
- Each exo instance periodically contacts the other instances and requests their status/metadata/capabilities through a new HTTP service?
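Something along these lines could work: a rough sketch assuming the `kubernetes` and `requests` packages, an `app=exo` label, and a hypothetical `/status` endpoint served by each instance on port 8000:

```python
import requests
from kubernetes import client, config

config.load_incluster_config()
v1 = client.CoreV1Api()
pods = v1.list_namespaced_pod("default", label_selector="app=exo").items
peer_ips = [p.status.pod_ip for p in pods if p.status.pod_ip]

capabilities = {}
for ip in peer_ips:
    try:
        # Hypothetical endpoint: each exo instance reports its own
        # status/metadata/capabilities as JSON.
        capabilities[ip] = requests.get(f"http://{ip}:8000/status", timeout=2).json()
    except requests.RequestException:
        pass  # peer not ready yet; retry on the next poll
```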
Thanks for your comments.
PS: I know it's a bit off-topic now, but below are a working Dockerfile, GitHub workflow, and Docker Hub image, in case someone else needs them:
- https://github.com/ofauchon/exo/blob/consul/docker/debian/Dockerfile
- https://github.com/ofauchon/exo/blob/consul/.github/workflows/docker-image.yaml
- https://hub.docker.com/repository/docker/ofauchon/exo
@ofauchon This should be scoped to DNS. I personally won't use Consul willingly due to past experiences, and this issue was targeted at things compatible with k8s out of the box. DNS is useful both within k8s (with headless services) and outside of it.
Consul is its own thing.
That said, it would be valuable separately for people who do use Consul, inside or outside of k8s. I'd open a separate issue.
Following