
Implement weighted round robin load balancing strategy

Open donovanmuller opened this issue 5 years ago • 5 comments

As per the supported load balancing strategies in the initial design, a weighted round robin strategy should be implemented to ensure the stated guarantees:

Weighted round robin - Specialisation of the above (default round robin #45) strategy but where a percentage weighting is applied to determine which cluster's Ingress node IPs to resolve. E.g. 80% cluster X and 20% cluster Y

Scenario 1:

  • Given 2 separate Kubernetes clusters, X, and Y
  • Each cluster has a healthy Deployment with a backend Service called app, and that backend Service is exposed with a Gslb resource on cluster X as:
apiVersion: ohmyglb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: app-gslb
  namespace: test-gslb
spec:
  ingress:
    rules:
      - host: app.cloud.example.com
        http:
          paths:
            - backend:
                serviceName: app
                servicePort: http
              path: /
  strategy: roundRobin 
    weight: 80%

and a Gslb resource on cluster Y as:

apiVersion: ohmyglb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: app-gslb
  namespace: test-gslb
spec:
  ingress:
    rules:
      - host: app.cloud.example.com
        http:
          paths:
            - backend:
                serviceName: app
                servicePort: http
              path: /
  strategy: roundRobin 
    weight: 20%
  • Each cluster has one worker node that accepts Ingress traffic. The worker node in each cluster has the following name and IP:
cluster-x-worker-1: 10.0.1.10
cluster-y-worker-1: 10.1.1.11

When issuing the following command, curl -v http://app.cloud.example.com, I would expect the resolved IPs to be as follows (if this command were executed 6 times consecutively):

$ curl -v http://app.cloud.example.com # execution 1
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 2
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 3
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 4
*   Trying 10.0.1.10...
...

$ curl -v http://app.cloud.example.com # execution 5
*   Trying 10.1.1.11...
...

$ curl -v http://app.cloud.example.com # execution 6
*   Trying 10.1.1.11...
...

The resolved node IPs that ingress traffic will be sent to should be spread approximately according to the weighting configured on the Gslb resources. In this scenario that would be about 80% resolved to cluster X and 20% resolved to cluster Y. The six executions above (4 out of 6 to cluster X, 2 out of 6 to cluster Y) only approximate that split.

NOTE:

  • The design of the specification around how to indicate the weighting as described in this issue is solely for the purpose of describing the scenario. It should not be considered a design.
  • The scenario where there are more than 2 clusters is currently undefined. I.e. how do the weightings get distributed in the event of missing weightings or uneven weightings? E.g. Given 3 clusters but only 2 Gslb resources in 2 clusters have a weight specified (that might or might not add up to 100%). How does that affect the distribution over 3 clusters?
  • Following on from the above, in the scenario where Deployments become unhealthy on a cluster, then the weighting should be adjusted to honour the weighting across the remaining clusters with healthy Deployments

donovanmuller avatar Feb 24 '20 20:02 donovanmuller

The round robin is taken care of by our library https://github.com/k8gb-io/go-weight-shuffling, which shuffles the indexes in the array according to the predefined weights.
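
For illustration only, here is a minimal Go sketch of what a weight-driven shuffle of indexes can look like. It is not the go-weight-shuffling API: the function name `weightedOrder`, the injected `intn` random source, and the sampling approach (repeated weighted draws without replacement) are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// weightedOrder returns a permutation of the indexes 0..len(weights)-1.
// Indexes with a larger weight are more likely to appear earlier.
// Weights are assumed to be non-negative. intn(n) must return a value in [0, n).
// Hypothetical helper for illustration; not the go-weight-shuffling API.
func weightedOrder(weights []int, intn func(n int) int) []int {
	remaining := make([]int, len(weights))
	total := 0
	for i, w := range weights {
		remaining[i] = i
		total += w
	}
	order := make([]int, 0, len(weights))
	for len(remaining) > 0 {
		pick := 0 // fallback when every remaining weight is zero
		if total > 0 {
			// Draw a point in [0, total) and find the index whose weight span covers it.
			point := intn(total)
			for pos, idx := range remaining {
				if point < weights[idx] {
					pick = pos
					break
				}
				point -= weights[idx]
			}
		}
		idx := remaining[pick]
		order = append(order, idx)
		total -= weights[idx]
		remaining = append(remaining[:pick], remaining[pick+1:]...)
	}
	return order
}

func main() {
	rng := rand.New(rand.NewSource(1))
	weights := []int{80, 20} // index 0 = cluster X, index 1 = cluster Y
	first := make([]int, len(weights))
	for i := 0; i < 10000; i++ {
		first[weightedOrder(weights, rng.Intn)[0]]++
	}
	fmt.Println("times each index came first:", first)
}
```

Over many runs, index 0 should come first in roughly 80% of the shuffles. An index with weight 0 is only placed after all positively weighted indexes.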

The library would run in the new CoreDNS external plugin within https://github.com/k8gb-io/coredns-crd-plugin. Data will be fed into this plugin from annotation within external DNS endpoints. The CoreDNS - external DNS plugin will read these annotations and arrange itself accordingly.

One cluster can have a number of IP addresses that can change from time to time. The N% weight is set per cluster (region), so I need to carry IP addresses × region × weight. The annotation is a string, so I can encode this as JSON:

[
  {"region": "eu", "weightPercent": 20, "targets": ["172.18.0.5", "172.18.0.6"]},
  {"region": "us", "weightPercent": 80, "targets": ["172.18.0.1", "172.18.0.2"]}
]

Set as an annotation on the DNSEndpoint:

apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  annotations:
    k8gb.absa.oss/dnstype: local
    k8gb.absa.oss/weight-round-robin: '[{"region":"eu","weightPercent":20,"targets":["172.18.0.5","172.18.0.6"]},{"region":"us","weightPercent":80,"targets":["172.18.0.1","172.18.0.2"]}]'
  name: k8gb-ns-extdns
  namespace: k8gb
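
As a rough sketch of the reading side, the annotation value can be decoded with Go's standard `encoding/json` package. The struct `weightGroup` and the function `parseWeightAnnotation` are hypothetical names chosen for this example. They are not taken from coredns-crd-plugin, whose actual code may look different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// weightGroup mirrors one element of the JSON array stored in the annotation.
// Hypothetical type for this sketch.
type weightGroup struct {
	Region        string   `json:"region"`
	WeightPercent int      `json:"weightPercent"`
	Targets       []string `json:"targets"`
}

// parseWeightAnnotation decodes the annotation string into weight groups.
// Hypothetical helper for this sketch.
func parseWeightAnnotation(raw string) ([]weightGroup, error) {
	var groups []weightGroup
	if err := json.Unmarshal([]byte(raw), &groups); err != nil {
		return nil, fmt.Errorf("invalid weight annotation: %w", err)
	}
	return groups, nil
}

func main() {
	raw := `[{"region":"eu","weightPercent":20,"targets":["172.18.0.5","172.18.0.6"]},{"region":"us","weightPercent":80,"targets":["172.18.0.1","172.18.0.2"]}]`
	groups, err := parseWeightAnnotation(raw)
	if err != nil {
		panic(err)
	}
	for _, g := range groups {
		fmt.Printf("%s: %d%% -> %v\n", g.Region, g.WeightPercent, g.Targets)
	}
}
```

Because the annotation is plain JSON, keys must be quoted. A value such as `{region: "eu"}` is rejected by the decoder.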

kuritka avatar Jun 06 '22 14:06 kuritka

@kuritka sorry if I'm coming late to the party, but I've tried to check settings for different implementations of WRR, and in most cases, WRR weights are provided as positive integer instead of percentage:

https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-weighted.html https://cloud.google.com/dns/docs/zones/manage-routing-policies https://en.wikipedia.org/wiki/Weighted_round_robin https://en.wikipedia.org/wiki/SRV_record

The resulting percentage for a particular member can be calculated as a proportion of the total weight of all the members in the WRR group.

This way users also don't have to keep in mind the rule of having 100% of weights in total across n clusters.
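
A small Go sketch of that calculation, assuming weights are non-negative integers keyed by cluster name. The function name `shares` and the map-based signature are made up for this example and are not part of k8gb.

```go
package main

import "fmt"

// shares converts integer weights into percentages of the total weight.
// Hypothetical helper; weights are assumed to be non-negative.
func shares(weights map[string]int) map[string]float64 {
	total := 0
	for _, w := range weights {
		total += w
	}
	out := make(map[string]float64, len(weights))
	for name, w := range weights {
		if total == 0 {
			out[name] = 0 // no weight set anywhere; this sketch leaves that case at zero
			continue
		}
		out[name] = 100 * float64(w) / float64(total)
	}
	return out
}

func main() {
	// Weights 4 and 1 give the same split as 80 and 20.
	for name, pct := range shares(map[string]int{"x": 4, "y": 1}) {
		fmt.Printf("%s: %.1f%%\n", name, pct)
	}
}
```

Only the ratio between the weights matters here, so 4 and 1 describe the same distribution as 80 and 20.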

somaritane avatar Jul 25 '22 21:07 somaritane

It's more about the point of view. The change is a little more extensive, but not complex. Currently it works with percentages and accepts values like "100", "100%", 100, 100%.

But I can change it so that the results are equivalent:

# current
weight:
  us: 50%
  eu: 30%
  za: 20
  uk: 0%

# new alternative (integer distribution)
weight:
  us: 50
  eu: 30
  za: 20
  uk: 0

weight:
  us: 5
  eu: 3
  za: 2
  uk: 0

weight:
  us: 64
  eu: 38
  za: 26
  uk: 0

kuritka avatar Jul 26 '22 07:07 kuritka

@kuritka Yep, so my point is that by using percentages we're forcing users to think in % and keep the "100% in total" rule in mind. Whereas when we use just numbers and calculate resulting weights as a proportion of the total sum, we give more flexibility. This approach still allows using percentages as well, as you've shown in the examples above. And we don't have to provide a complex validation in this case, we might just limit the max weight to some sensible value, like 1000

somaritane avatar Jul 26 '22 08:07 somaritane

ok, will keep it consistent with route53 style and refactor k8gb controller to integers.

kuritka avatar Jul 26 '22 08:07 kuritka

@kuritka as WRR is implemented, can we close this issue?

ytsarev avatar Dec 15 '23 23:12 ytsarev

Hi @ytsarev, sure we can. I assumed it was already closed.

kuritka avatar Dec 16 '23 09:12 kuritka

WRR was released in https://github.com/k8gb-io/k8gb/releases/tag/v0.11.1. Closing

ytsarev avatar Dec 17 '23 10:12 ytsarev