
Smart Load Balancer - Route requests for same function to subset of machines

Open treeder opened this issue 9 years ago • 0 comments

In order to optimize various things such as:

  • reducing image pulls
  • reducing disk space for image cache
  • streaming inputs to running containers (#214) - hot functions

the load balancer needs to route requests for a given function to a subset of machines.

IronLB

The Problem

IronFunctions requires a load balancer to route requests to IronFunctions nodes. With a regular load balancer, requests for every function go to all of the nodes. This is very suboptimal: every machine needs to store all of the function images, and we can't take advantage of hot/streaming containers.

The Solution

If we route requests for a particular function to a subset of machines, we get the following benefits:

  • reducing image pulls
  • reducing disk space for image cache
  • streaming inputs to running containers, AKA hot containers (#214)
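
As a rough illustration of the disk-space benefit (a back-of-the-envelope estimate, not a measurement): assume F functions, N nodes, each function pinned to k distinct nodes, and functions spread evenly by hashing. The symbols and the example numbers are assumptions made for this sketch. A regular load balancer eventually puts all F images on every node, while subset routing gives, on average:

```latex
\[
\mathbb{E}[\text{images per node}] = \frac{F \cdot k}{N},
\qquad \text{e.g. } F = 300,\ N = 30,\ k = 3
\;\Rightarrow\; \frac{300 \cdot 3}{30} = 30 \text{ images per node instead of } 300.
\]
```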

See https://github.com/iron-io/functions/issues/151

We can extend an existing load balancer like Vulcand to solve the problem. At a minimum, the load balancer(s) should be able to route requests for function X to a fixed set of nodes (say 3 by default).

[Figure: lb-drawing]

Usage

As with any other load balancer, users will start some number of IronLB nodes to route traffic to IronFunctions nodes. The logic for routing traffic to specific nodes will be baked in, so there shouldn't be much more configuration than telling the load balancers where to route traffic.

IronLB will be delivered as a Docker image.

High Level Implementation

The Docker image will start the LB and etcd.

  • For each request, get app_name and path (can be obtained from URL).
  • Check etcd for app_name.path. If it exists, send traffic to one node from the list received. Otherwise, continue:
  • Use consistent hash or similar to route request based on app_name.path to MAX_NODES (3 default) nodes
    • How to consistent hash to multiple nodes?
  • Store the mapping app_name.path -> set of node IPs in etcd
  • Send traffic to one of the nodes.
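
The steps above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the IronLB implementation: a plain dict stands in for etcd, the URL shape `/r/<app_name>/<path>` is assumed, and `Ring`, `route_key`, and `choose_node` are hypothetical names. It also shows one possible answer to "how to consistent hash to multiple nodes": walk the ring clockwise from the key's position and collect the first MAX_NODES distinct nodes.

```python
import bisect
import hashlib
from urllib.parse import urlsplit

MAX_NODES = 3  # default subset size from the proposal


def _hash(value):
    """Stable 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class Ring:
    """Consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes, vnodes=64):
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]
        self._node_count = len(set(nodes))

    def pick(self, key, count=MAX_NODES):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._points:
            return []
        count = min(count, self._node_count)
        start = bisect.bisect(self._hashes, _hash(key))
        chosen = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == count:
                    break
        return chosen


def route_key(url):
    """Build "app_name.path" from a URL of the assumed form /r/<app_name>/<path>."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) < 3 or parts[0] != "r":
        raise ValueError(f"not a function route: {url}")
    return f"{parts[1]}.{'/'.join(parts[2:])}"


def choose_node(url, ring, store, counters):
    """Return the node that should serve this request."""
    key = route_key(url)
    nodes = store.get(key)          # check the store (etcd stand-in)
    if not nodes:
        nodes = ring.pick(key)      # hash to at most MAX_NODES nodes
        if not nodes:
            raise RuntimeError("no nodes available")
        store[key] = nodes          # remember the assignment
    turn = counters.get(key, 0)     # round-robin within the subset
    counters[key] = turn + 1
    return nodes[turn % len(nodes)]
```

With `ring = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"])` and empty dicts for `store` and `counters`, repeated calls to `choose_node("http://lb/r/myapp/hello", ring, store, counters)` cycle through the same three nodes. The sketch does not handle nodes joining or leaving; a real version would need to refresh the stored assignments when the node set changes.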

Try to implement this as a Vulcand middleware. It's not clear that this is possible: I don't see a way for a middleware to route to a specific server, so this needs more digging. Otherwise, fork Vulcand and add this feature to the "Backend" that handles the servers.

Also consider https://github.com/containous/traefik instead of Vulcand.

First Deliverable

Working MVP for use with IronFunctions.

Future Improvements - not part of initial scope

  • An additional configuration option could be how many nodes to route a function's traffic to. For instance, for really high-load functions, you may want to allow a particular function to go to 10 nodes instead of the default 3.
  • If we had traffic stats for particular routes, we could start by sending all requests for a function to one node and add nodes as traffic increases.
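
Both ideas could be combined in a small sizing function. The sketch below is only an illustration: `node_count`, the `overrides` mapping, and the `per_node_rps` threshold of 50 requests per second are hypothetical and not part of the proposal.

```python
import math

DEFAULT_MAX_NODES = 3  # default cap from the proposal


def node_count(route_key, requests_per_sec, overrides=None, per_node_rps=50.0):
    """Start at one node and widen as traffic grows, up to a per-route cap."""
    # Per-route override, e.g. {"myapp.hello": 10} for a high-load function.
    cap = (overrides or {}).get(route_key, DEFAULT_MAX_NODES)
    # Nodes needed if each node handles roughly per_node_rps (assumed figure).
    wanted = math.ceil(requests_per_sec / per_node_rps)
    return max(1, min(cap, wanted))
```

For example, `node_count("myapp.hello", 60)` yields 2 nodes, while the same route at 1000 requests per second stays at the default cap of 3 unless an override such as `{"myapp.hello": 10}` raises it.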

treeder, Oct 12 '16 18:10