Kubernetes backend not compatible with Google GKE

Open mvdbosch opened this issue 6 years ago • 2 comments

The Kubernetes backend is not compatible with GKE (Google Kubernetes Engine).

Rationale for wanting this:

  • Very cost-effective and auto-scalable GKE backend (less worrying about managing and scaling the clusters)
  • Compute/Storage is cost-effective and scalable on Google Cloud
  • Having an end-user self-service application (i.e. Zoe) on top of it makes the apps available to our Data Science teams without them having to worry about access to kubectl and the Google Cloud Console.

Some of the issues I faced:

  • A small discrepancy in the Kubernetes metrics returned by GKE prevents zoe-master from starting (can be patched by fixing the parsing of the returned Kubernetes metrics)
  • Endpoint details not visible in execution view
  • Leverage the built-in Ingress / LoadBalancer options (I experimented with applying some small changes in the Kubernetes backend to get this to work)
  • Some form of security is recommended in combination with the LoadBalancer option, as it exposes the service directly on an external IP. This is desirable for end-user ease of use, but shifts the authorization responsibility to the ZApp.
  • Issue with the shared workspace disk: GKE is a managed Kubernetes service, so it is not possible to mount a shared disk at e.g. /mnt/zoe-workspace on the nodes. The preferred solution would be to mount a Google Cloud Storage bucket (gs://) through a FUSE driver. As a workaround, this could be built into the Docker image/ZApp, but a more elegant solution would be preferable.
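
To illustrate the metrics point above: the thread does not say exactly which value failed to parse, so the following is only a minimal sketch, assuming the discrepancy comes from suffixed Kubernetes resource quantities (e.g. CPU reported as "940m" or memory as "3783992Ki"). The helper name parse_quantity is hypothetical and is not part of Zoe.

```python
from decimal import Decimal

# Suffixes used by Kubernetes resource quantities (binary and decimal).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40, "Pi": Decimal(2) ** 50, "Ei": Decimal(2) ** 60,
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal("1e3"), "M": Decimal("1e6"), "G": Decimal("1e9"),
    "T": Decimal("1e12"), "P": Decimal("1e15"), "E": Decimal("1e18"),
}


def parse_quantity(value):
    """Hypothetical helper: convert a quantity such as '940m' or '3783992Ki'
    to a Decimal (cores for CPU, bytes for memory)."""
    text = str(value).strip()
    # Check two-character suffixes first so 'Mi' is not read as a bare number
    # followed by an unknown 'i'.
    for length in (2, 1):
        suffix = text[-length:]
        if len(text) > length and suffix in _SUFFIXES:
            return Decimal(text[:-length]) * _SUFFIXES[suffix]
    # No known suffix: treat the value as a plain number.
    return Decimal(text)
```

With a parser like this, a backend could accept both plain integers and suffixed strings from the node status instead of failing at startup.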

mvdbosch avatar Dec 30 '18 18:12 mvdbosch

In my forked repository, I have worked on the following points:

  • [ Done ] In the Kubernetes backend, added a property to the ZApp definition to switch from NodePort to LoadBalancer (the Google Cloud native LoadBalancer implementation) to expose ports other than HTTP.

  • [ To Do ] Endpoint handling should be different when the LoadBalancer service type is used, instead of the Ingress or internal display.

  • [ Done ] Additional configuration flags and handling logic to ensure endpoints are shown when using the Kubernetes Ingress through NGINX

  • [ Done ] Fixed an issue with the metrics returned from GKE

  • [ Done ] Added a flag to kubernetes-auto-ingress so that the generated Ingress is picked up by kube-lego. That way, an SSL certificate from Let's Encrypt can be requested and served through the NGINX reverse proxy, which effectively provides SSL termination in front of the ZApps.
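
As a rough illustration of the NodePort/LoadBalancer switch and the kube-lego flag described above, here is a minimal sketch that builds the manifests as plain Python dictionaries. The function names, parameters, label selector and secret name are assumptions made for the example and are not taken from the fork. The `kubernetes.io/tls-acme` annotation is the one kube-lego looks for, and the Ingress uses the `extensions/v1beta1` layout from the kube-lego era (newer clusters use `networking.k8s.io/v1`, which has a different backend format).

```python
def build_service(name, port, use_load_balancer=False):
    """Hypothetical helper: Service manifest exposed as NodePort by default,
    or through the cloud provider's LoadBalancer when requested."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer" if use_load_balancer else "NodePort",
            "selector": {"app": name},  # assumed label convention
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def build_ingress(name, host, port, request_tls=False):
    """Hypothetical helper: NGINX Ingress that kube-lego picks up
    when request_tls is set."""
    annotations = {"kubernetes.io/ingress.class": "nginx"}
    spec = {
        "rules": [{
            "host": host,
            "http": {"paths": [{
                "path": "/",
                "backend": {"serviceName": name, "servicePort": port},
            }]},
        }],
    }
    if request_tls:
        # kube-lego requests a certificate for Ingresses carrying this
        # annotation and stores it in the secret named in the tls section.
        annotations["kubernetes.io/tls-acme"] = "true"
        spec["tls"] = [{"hosts": [host], "secretName": name + "-tls"}]
    return {
        "apiVersion": "extensions/v1beta1",
        "kind": "Ingress",
        "metadata": {"name": name, "annotations": annotations},
        "spec": spec,
    }
```

Keeping both switches as simple booleans mirrors the idea of a per-ZApp property: the default behaviour stays NodePort without TLS, and GKE users opt in explicitly.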

mvdbosch avatar Jan 03 '19 20:01 mvdbosch

Thank you for this work; the Kubernetes backend was in need of some attention. If you open a merge request, I will review and accept it. Thanks!

dvenza avatar Jan 18 '19 17:01 dvenza