Support LoadBalancer Services
Kubernaut currently does not support `LoadBalancer` services, but many people use `type: LoadBalancer` in their service manifests. The current workaround is to use `type: NodePort`, but this means changing manifests specifically for Kubernaut, which is undesirable.
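For reference, this is the kind of manifest people are writing (the app name is a hypothetical placeholder); the workaround amounts to changing the `type` field:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app          # hypothetical service name
spec:
  type: LoadBalancer    # the workaround is to change this to NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```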
Preliminary Research
I started examining this problem shortly after releasing Kubernaut. Unfortunately, I have discovered there is no simple, ideal solution.
The Primary Problem
Kubernaut does not support `type: LoadBalancer` services because they present a serious Quality-of-Service issue for multiple users. Here are the facts:
- Each `LoadBalancer` service on AWS creates an Amazon ELB.
- We cannot restrict the number of `LoadBalancer` services that can be created, and therefore we cannot limit the number of Amazon ELB instances that are created.
- The maximum number of Amazon ELB instances you can create is determined by your AWS account quota.
- The AWS account quota can be increased, but it is a manual (human) operation that has no API and needs to be justified to AWS.
- When the quota is exhausted, a user who interacts with Kubernetes via `kubectl` sees their `LoadBalancer` service stuck in a `pending` state without explanation (see the example after this list). To these users the system appears broken.
- Because humans do dumb or malicious things, it is likely that a single user or a small group of users will exhaust the LoadBalancer pool for the other users, causing massive QoS degradation.
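For illustration, this is roughly what that user experience looks like (the service name is hypothetical; exact columns vary by kubectl version):

```console
$ kubectl get svc my-app
NAME     TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
my-app   LoadBalancer   10.0.171.239   <pending>     80:31852/TCP   20m
```

The `EXTERNAL-IP` column stays at `<pending>` indefinitely, with no indication that an account quota is the cause.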
Problem 2
This is a small problem and is internal to our operation of the service.

Amazon ELB instances are NOT free. They are charged at $0.025/hr just to run, which at ~732 hours/month works out to ~$18.30/mo. Data transfer is charged at $0.008/GB in and out. At a minimum, assuming one user consistently uses the service for a whole month, we spend ~$18/user just for the ELB. Data is cheap, and a user would need to pump a lot of data through the system to cost us much money, but let's once again assume humans are dumb or malicious and someone decides to upload a 100MB archive in a loop through their service. We need monitoring in place that allows us to cut that user off. That is engineering effort for us, purely to prevent someone from causing financial harm.
Problem 3
I examined whether it was possible to disable the ELB provisioning functionality in the AWS cloud provider integration for Kubernetes, and the answer is that it is not possible. My idea was to disable it and replace it with a different service controller that would talk to, say, an HAProxy or Nginx cluster we run that could act as a multi-tenant TCP load balancer.
A brief technical explanation:
- When `kube-controller-manager` and `kubelet` come up, you specify a cloud provider via a CLI switch (see the example after this list). In our case we specify `aws`, which enables the AWS integration for Kubernetes and makes it possible to run on that cloud and use that cloud's API to bootstrap the cluster machinery (e.g. it inspects the EC2 metadata service). It also enables features like using EBS volumes for storage.
- There is no exposed way to override the `services-controller` that I could find. The `services-controller` is responsible for actually orchestrating the creation of the load balancer (create the load balancer, create firewall rules, add/remove nodes in the backend pool).
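For reference, the switch in question is `--cloud-provider`, passed to both daemons (the remaining flags are elided here):

```console
kube-controller-manager --cloud-provider=aws ...
kubelet --cloud-provider=aws ...
```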
If we want this kind of customizability it seems we need to get involved in the Kubernetes development process:

- We need to consult the Kubernetes team on the best path forward.
  - Should the `services-controller` used by a cloud provider be overridable?
  - If yes, is that done generically, outside of the AWS integration (likely), or only for the AWS integration (maybe)?
- We need to modify the Kubernetes code and get it shipped in a release, OR alternatively find someone to do it (e.g. ask for the enhancement in an issue).
- Deployment tools, specifically `kubeadm` in our case, need to be updated to expose the new configuration mechanism for specifying the new `services-controller`.
Doing additional implementation research: to avoid hard AWS limits we will need to configure the following (a sketch of where these are set follows the list):

- `DisableSecurityGroupIngress`: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L441
- `ElbSecurityGroup`: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L446
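Both options are read from the cloud provider config file (an INI-style file passed via `--cloud-config`); a minimal sketch, assuming a pre-created shared security group (the group ID is a placeholder):

```ini
[Global]
; Stop the integration from managing ingress rules on node security groups.
DisableSecurityGroupIngress = true
; Use one pre-created security group for ELBs instead of one group per ELB.
ElbSecurityGroup = sg-0123456789abcdef0
```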
We will also need to adjust our usage of `kubeadm` during bootstrap: https://kubernetes.io/docs/admin/kubeadm/ (see specifically "Cloudprovider integrations (experimental)").
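At the time of writing, that experimental integration is driven by kubeadm's config file; a rough sketch of the relevant part (field names per the then-current `v1alpha1` API, which varies by version):

```yaml
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
cloudProvider: aws   # enables the AWS integration on the control plane
```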
Suggested approach I learned of at Velocity NYC from Kelsey Hightower
- Use RBAC to limit users to one `LoadBalancer` service if that's desired. This would make users not quite admins in the Kubernetes cluster, but that's probably OK as it is a very small corner. (A sketch of the cap follows this list.)
- Write a control loop that monitors the Kubernetes API and does the endpoint -> service mapping magic. Watch for `type: LoadBalancer` services. (A sketch of the loop also follows.)
- Use the PATCH mechanism to set the status of the `LoadBalancer` to the public IP that was created via an external process (e.g. a multi-tenant nginx TCP load balancer).
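One concrete way to express the per-user cap (a ResourceQuota rather than RBAC proper, since quotas govern object counts while RBAC governs verbs) is a sketch like this, assuming each user gets their own namespace:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: lb-limit
  namespace: user-1               # hypothetical per-user namespace
spec:
  hard:
    services.loadbalancers: "1"   # at most one LoadBalancer service
```

And a minimal sketch of the control loop itself, using client-go. `provisionExternalLB` is a hypothetical stand-in for whatever configures the multi-tenant nginx/HAProxy load balancer, and a real controller would use a watch/informer rather than polling:

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// provisionExternalLB is hypothetical: it would configure the external
// TCP load balancer for this service and return the public IP it got.
func provisionExternalLB(svc *corev1.Service) (string, error) {
	return "203.0.113.10", nil // placeholder IP
}

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	for {
		svcs, err := client.CoreV1().Services("").List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			panic(err)
		}
		for i := range svcs.Items {
			svc := &svcs.Items[i]
			// Only touch LoadBalancer services that are still "pending".
			if svc.Spec.Type != corev1.ServiceTypeLoadBalancer ||
				len(svc.Status.LoadBalancer.Ingress) > 0 {
				continue
			}
			ip, err := provisionExternalLB(svc)
			if err != nil {
				continue // retry on the next pass
			}
			// PATCH the status subresource with the public IP; this is what
			// moves the service out of "pending" from the user's perspective.
			patch := fmt.Sprintf(`{"status":{"loadBalancer":{"ingress":[{"ip":%q}]}}}`, ip)
			if _, err := client.CoreV1().Services(svc.Namespace).Patch(
				context.TODO(), svc.Name, types.MergePatchType,
				[]byte(patch), metav1.PatchOptions{}, "status"); err != nil {
				fmt.Println("status patch failed:", err)
			}
		}
		time.Sleep(10 * time.Second)
	}
}
```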
I discovered this while poking around for AWS and Kubernetes stability issues... it would be an additional implementation problem: https://github.com/kubernetes/kubernetes/issues/29298