
Scale to Zero

jamding opened this issue 5 years ago • 11 comments

Is this a BUG REPORT or FEATURE REQUEST?:

Feature Request

What happened:

The current autoscaling option uses an HPA, but we have many low-traffic Functions. I'd trade off cold-start latency for zero resource usage under zero load. I'd propose making this opt-in (off by default).

Knative Serving addresses this, but it has a hard dependency on Istio, which is not an option for my cluster.

What you expected to happen:

When Functions receive 0 traffic for some threshold (e.g. 180 seconds), the HPA pod count scales to 0. When a function has traffic, it defaults to the HPA or static RS behavior.

I'd propose a mechanism similar to knative's activator, where all traffic for 0-pod Functions is routed to an operator which receives the request, scales the Function to a non-0 number of pods, then forwards the original request accordingly. Obviously there will be increased latency on cold starts; we can mitigate this by respecting client timeouts and responding with QoS response codes if requests become a thundering herd.
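To illustrate the thundering-herd mitigation, here is a minimal Go sketch (an illustration, not kubeless code) that coalesces concurrent cold-start requests for the same function with golang.org/x/sync/singleflight and respects the client's timeout. scaleUpAndWait, the per-function vhost routing, and the timeout budget are all placeholders/assumptions:

```go
// Hypothetical activator front end: coalesce concurrent cold-start requests
// for the same function and respect the client's timeout.
package main

import (
	"context"
	"log"
	"net/http"
	"time"

	"golang.org/x/sync/singleflight"
)

var group singleflight.Group

// scaleUpAndWait is a placeholder for "set replicas > 0 and block until ready".
func scaleUpAndWait(ctx context.Context, fn string) error {
	// ... scale the function's deployment and poll readiness here ...
	return nil
}

func activate(w http.ResponseWriter, r *http.Request) {
	fn := r.Host // one vhost per function is an assumption of this sketch

	// All concurrent requests for a cold function share a single scale-up.
	done := group.DoChan(fn, func() (interface{}, error) {
		return nil, scaleUpAndWait(context.Background(), fn)
	})

	select {
	case res := <-done:
		if res.Err != nil {
			http.Error(w, "function failed to start", http.StatusServiceUnavailable)
			return
		}
		// forward the original request to the now-ready function svc (omitted)
	case <-r.Context().Done():
		// the client gave up; shed load instead of piling onto the herd
		http.Error(w, "timed out waiting for cold start", http.StatusGatewayTimeout)
	case <-time.After(30 * time.Second):
		http.Error(w, "cold start exceeded budget", http.StatusServiceUnavailable)
	}
}

func main() {
	http.HandleFunc("/", activate)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With coalescing, a burst of N requests against a cold function triggers one scale-up rather than N, and callers that give up stop consuming activator resources.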

jamding avatar Dec 17 '18 17:12 jamding

I probably won't have time to implement this for a couple weeks, but I was thinking of the following approach:

  • minimal prometheus deployment
  • introduce a new CRD with the same interface as function.kubeless.io; let's call it scalingfunction.kubeless.io
  • new k8s operator which watches for idleness and marks functions for scale-to-zero
  • new k8s operator which can receive HTTP requests and marks functions for scale-up

This proposal introduces a couple of elements that compose with the existing kubeless operators rather than changing them directly. Notably, there is no istio dependency.
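For concreteness, the new CRD could look something like the Go type sketch below; the field names and the idea of referencing the underlying function by name are assumptions, not a settled design:

```go
// One possible shape for the proposed scalingfunction.kubeless.io CRD:
// the function.kubeless.io interface plus scale-to-zero settings.
package v1beta1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ScalingFunctionSpec wraps an existing function with idleness settings.
type ScalingFunctionSpec struct {
	// FunctionName references the underlying function.kubeless.io object.
	FunctionName string `json:"functionName"`
	// IdleTimeoutSeconds is how long the function may receive zero traffic
	// before it is scaled to zero (e.g. 180).
	IdleTimeoutSeconds int32 `json:"idleTimeoutSeconds"`
}

// ScalingFunction is the top-level CRD object.
type ScalingFunction struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              ScalingFunctionSpec `json:"spec"`
}
```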

Scale down:

  • idler operator watches all functions and, using the prometheus metrics already exposed on function runtimes, determines whether a function has been idle long enough for a scale-down. If so, it labels the scalingfunction deployment with the timestamp of the last request
  • activator operator watches for idle functions, then atomically changes a wrapping svc to point at the activator operator instead of the function.kubeless.io deployment's svc
  • either during the svc swap or later, some operator changes the function.kubeless.io spec to have an rs of 0 replicas (a sketch of this step follows the list)
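A minimal sketch of that final scale-down step, assuming the idler records the last request time in a hypothetical kubeless.io/last-request annotation and that function deployments carry a hypothetical function=true label:

```go
// Scale idle function deployments to zero via the client-go scale subresource.
package main

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const idleTimeout = 180 * time.Second

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	ctx := context.Background()

	// "function=true" is a hypothetical label; kubeless uses its own labels
	deps, err := client.AppsV1().Deployments("default").
		List(ctx, metav1.ListOptions{LabelSelector: "function=true"})
	if err != nil {
		log.Fatal(err)
	}
	for _, d := range deps.Items {
		// kubeless.io/last-request is a hypothetical annotation stamped by the idler
		last, err := time.Parse(time.RFC3339, d.Annotations["kubeless.io/last-request"])
		if err != nil || time.Since(last) < idleTimeout {
			continue // never stamped, or not idle long enough
		}
		scale, err := client.AppsV1().Deployments(d.Namespace).GetScale(ctx, d.Name, metav1.GetOptions{})
		if err != nil {
			continue
		}
		scale.Spec.Replicas = 0 // the actual scale-to-zero
		if _, err := client.AppsV1().Deployments(d.Namespace).UpdateScale(ctx, d.Name, scale, metav1.UpdateOptions{}); err != nil {
			log.Printf("scale down %s: %v", d.Name, err)
		}
	}
}
```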

Scale up:

  • when the activator receives an http request on behalf of a function, it marks the scalingfunction for scale-up
  • the activator uses function.kubeless.io as a primitive and replaces the function.kubeless.io deployment corresponding to the scalingfunction deployment with a non-0 rs or hpa
  • the activator holds the initial http request, then forwards it to the function.kubeless.io svc once the function is ready (see the sketch after this list)
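A rough Go sketch of that scale-up-then-forward path, assuming one Deployment and Service per function, both named after the function; how the activator identifies the target function is also an assumption:

```go
// Hypothetical activator: scale the function up, wait for readiness, then
// forward the held request to the function's service.
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

var client *kubernetes.Clientset

// waitReady bumps the function's deployment to one replica (if needed) and
// blocks until a replica reports ready or the request context expires.
func waitReady(ctx context.Context, ns, name string) error {
	scale, err := client.AppsV1().Deployments(ns).GetScale(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if scale.Spec.Replicas == 0 {
		scale.Spec.Replicas = 1
		if _, err := client.AppsV1().Deployments(ns).UpdateScale(ctx, name, scale, metav1.UpdateOptions{}); err != nil {
			return err
		}
	}
	for {
		d, err := client.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if d.Status.ReadyReplicas > 0 {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // respect the client's timeout
		case <-time.After(500 * time.Millisecond):
		}
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	fn := r.URL.Query().Get("function") // how the function is identified is an assumption
	if err := waitReady(r.Context(), "default", fn); err != nil {
		http.Error(w, "cold start failed: "+err.Error(), http.StatusServiceUnavailable)
		return
	}
	// forward the held request to the function's (assumed) svc name
	target, _ := url.Parse(fmt.Sprintf("http://%s.default.svc.cluster.local:8080", fn))
	httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client = kubernetes.NewForConfigOrDie(cfg)
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```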

This proposal treats the kubeless function CRD as a primitive and proposes composing these scaling operators and CRDs with the existing kubeless ones.

The main disadvantage is no interoperability with existing kubeless triggers, since it requires having another svc wrap the one the kubeless function controller creates. One proposal is to make those trigger controllers "wrapper-aware".

jamding avatar Jan 07 '19 15:01 jamding

+1 for this.

Cold start using knative is pretty slow (around 6-8 seconds, even with pre-pulled images etc.), so it pretty much rules out using it for any user-facing API or web endpoints which are called in a synchronous/blocking way.

If cold start on kubeless could be fast enough to satisfy that kind of use case it would be awesome..!

danielwhatmuff avatar Jan 08 '19 16:01 danielwhatmuff

Thanks for the write up @jamding, a couple of comments on your proposal:

  • I think we can avoid the prometheus dependency. Functions already expose a /metrics endpoint that the controller can call to retrieve the statistics needed (a parsing sketch follows this list).
  • We can also avoid having a new CRD: the new controller can watch HorizontalPodAutoscaler items and act if those HPAs are associated with a function.
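A sketch of the prometheus-free approach: scrape a function's /metrics endpoint directly and parse it with the prometheus expfmt package. The service URL and the function_calls_total metric name are assumptions; each kubeless runtime exposes its own metric names:

```go
// Read a function's request count straight from its /metrics endpoint,
// avoiding a Prometheus server deployment.
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

func main() {
	resp, err := http.Get("http://myfunction.default.svc.cluster.local:8080/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	// the metric name is an assumption; adapt it to the runtime's exposition
	if mf, ok := families["function_calls_total"]; ok {
		for _, m := range mf.GetMetric() {
			fmt.Println("calls:", m.GetCounter().GetValue())
		}
	}
}
```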

What I don't see so clearly is how we can implement the HTTP interceptor. We would need to implement some kind of HTTP gateway to route requests to the function services (and scale them up if needed). That can lead to higher response times and slow cold starts.

We can do a POC though to clarify how that may work.

andresmgot avatar Jan 09 '19 09:01 andresmgot

Yes, +1 for this. Scale-to-zero would be perfect for some functions; I would love to run a tight cluster until it receives traffic, while keeping HPA scaling for the other functions I know are dependencies.

CodeSwimBikeRunner avatar Feb 12 '19 22:02 CodeSwimBikeRunner

I think you should have a look at how other people implement scale-to-zero: https://github.com/deislabs/osiris. I stumbled on it in https://github.com/kedacore/keda where they suggest it as an alternative to knative serving.

But if you are worried about cold-start speed, you'd have to move towards a worker pool.
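As a toy Go illustration of the worker-pool idea (not tied to any of the projects above): keep a buffer of pre-started workers and hand them out on demand, so requests never pay the cold start:

```go
// Pre-warmed pool: requests claim an already-started worker instead of
// triggering a cold start.
package main

import "fmt"

type worker struct{ id int }

func main() {
	pool := make(chan worker, 3)
	for i := 0; i < cap(pool); i++ {
		pool <- worker{id: i} // pre-warm: start workers before any traffic
	}

	w := <-pool // a request arrives: claim a warm worker, no cold start
	fmt.Println("handling request on warm worker", w.id)
	pool <- worker{id: w.id} // replenish the pool for the next request
}
```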

reegnz avatar May 27 '19 15:05 reegnz

Hi! Any updates on this? In my opinion, this is a must-have for every serverless FaaS tool.

Has anyone considered developing a pool manager like Fission did, or an idler like OpenFaaS?

delucca avatar Mar 06 '20 22:03 delucca

I've noticed that this was merged in July 2019 and is part of the Kubernetes 1.16 release: https://github.com/kubernetes/kubernetes/pull/74526

It does not seem to be GA yet and requires enabling the HPAScaleToZero feature gate. It also seems to require custom/external metrics (so I guess using CPU would not work); see https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/

I assume this could be used instead of requiring a custom implementation?
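To make that concrete, here is a sketch of what a scale-to-zero HPA might look like once the HPAScaleToZero feature gate is enabled: minReplicas set to 0 plus an external metric, since CPU alone would not work. The metric name and target are made up for illustration:

```go
// Build a scale-to-zero HPA object (autoscaling/v2beta2, the API of that era)
// and print it as YAML.
package main

import (
	"fmt"
	"log"

	autoscalingv2beta2 "k8s.io/api/autoscaling/v2beta2"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	min := int32(0) // only accepted with --feature-gates=HPAScaleToZero=true
	target := resource.MustParse("10")
	hpa := autoscalingv2beta2.HorizontalPodAutoscaler{
		TypeMeta:   metav1.TypeMeta{APIVersion: "autoscaling/v2beta2", Kind: "HorizontalPodAutoscaler"},
		ObjectMeta: metav1.ObjectMeta{Name: "myfunction"},
		Spec: autoscalingv2beta2.HorizontalPodAutoscalerSpec{
			ScaleTargetRef: autoscalingv2beta2.CrossVersionObjectReference{
				APIVersion: "apps/v1", Kind: "Deployment", Name: "myfunction",
			},
			MinReplicas: &min, // the scale-to-zero part
			MaxReplicas: 5,
			Metrics: []autoscalingv2beta2.MetricSpec{{
				Type: autoscalingv2beta2.ExternalMetricSourceType,
				External: &autoscalingv2beta2.ExternalMetricSource{
					// hypothetical external metric served by a metrics adapter
					Metric: autoscalingv2beta2.MetricIdentifier{Name: "function_calls"},
					Target: autoscalingv2beta2.MetricTarget{
						Type:         autoscalingv2beta2.AverageValueMetricType,
						AverageValue: &target,
					},
				},
			}},
		},
	}
	out, err := yaml.Marshal(hpa)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```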

fernandrone avatar Jun 08 '20 20:06 fernandrone

Indeed, once Kubernetes supports it, this would be trivial and we wouldn't need to add custom support for it. It would be great if someone could validate this and make it work with Kubeless.

andresmgot avatar Jun 09 '20 09:06 andresmgot

Do we have any update on this? I heard great things about Kubeless (https://www.appvia.io/blog/serverless-on-kubernetes), but this one issue was listed as a con, with no activity.

santhoshsonti4 avatar Jun 28 '21 21:06 santhoshsonti4

Any updates on this?

Becavalier avatar Jul 01 '21 05:07 Becavalier

Any news on that?

eduard93 avatar Jul 16 '21 15:07 eduard93