With multiple triggers (CPU and HTTP) and minReplicaCount of 0, KEDA erroneously scales to 0.

Open mengland-noaa opened this issue 9 months ago • 8 comments

Report

With the CPU scaler and the http-external-scaler as triggers in the same ScaledObject, the HTTP scaler supersedes the CPU scaler. Under heavy CPU load and with HTTP request(s) the workload scales up successfully, but KEDA subsequently intervenes and scales it to 0, ignoring CPU.

Expected Behavior

Under heavy CPU load, even with no HTTP requests, KEDA should not scale down to 0.

Actual Behavior

The HTTP add-on appears to be overriding the CPU scaler.

Steps to Reproduce the Problem

  1. Create an nginx or other deployment paired with a CPU load test side car or init container. The memory scaler behaves similarly.
  2. Send an http request and watch as it initially scales up then scales back down to 0.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
spec:
  selector:
    app: my-app
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: my-namespace
spec:
  replicas: 0
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
            limits:
              cpu: "500m"
              memory: "100Mi"
        - name: stress-ng
          image: polinux/stress-ng:latest
          command: ["/bin/sh", "-c"]
          args:
            - "echo 'Running stress-ng'; stress-ng --cpu 1 --vm 1 --vm-bytes 64M --timeout 300s; echo 'stress-ng finished'; sleep 3600"
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
            limits:
              cpu: "1000m"
              memory: "1000Mi"
---
kind: ScaledObject
apiVersion: keda.sh/v1alpha1
metadata:
  name: my-scaled-object
  namespace: my-namespace
spec:
  initialCooldownPeriod: 120
  cooldownPeriod: 30
  minReplicaCount: 0
  maxReplicaCount: 4
  pollingInterval: 5
  fallback:
    failureThreshold: 5
    replicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  advanced:
    horizontalPodAutoscalerConfig:
      name: custom-hpa-name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
  triggers:
    - type: cpu
      name: cpu_trig
      metricType: Utilization
      metadata:
        value: "10"
    - type: external
      name: http_trig
      metadata:
        httpScaledObject: my-scaled-object
        hosts: "myhost"
        scalerAddress: keda-add-ons-http-external-scaler.keda:9090
---
kind: HTTPScaledObject
apiVersion: http.keda.sh/v1alpha1
metadata:
  name: my-scaled-object
  namespace: my-namespace
  annotations:
      httpscaledobject.keda.sh/skip-scaledobject-creation: "true"
spec:
  hosts:
  - "myhost"
  scalingMetric:
    requestRate:
      granularity: 1s
      targetValue: 2
      window: 1m
  scaledownPeriod: 300
  scaleTargetRef:
      name: my-deployment
      service: my-service
      port: 80
  replicas:
      min: 0
      max: 4
  targetPendingRequests: 1
---
kind: Service
apiVersion: v1
metadata:
  name: keda-add-ons-http-interceptor-proxy
  namespace: my-namespace
spec:
  type: ExternalName
  externalName: keda-add-ons-http-interceptor-proxy.keda.svc.cluster.local

Logs from KEDA HTTP operator

No response

HTTP Add-on Version

0.10.0

Kubernetes Version

None

Platform

Any

Anything else?

No response

mengland-noaa avatar Feb 26 '25 03:02 mengland-noaa

I have the same problem with the HTTP scaler + cron. The cron trigger declares 1 replica for some interval, but if there aren't any HTTP requests, the HTTP scaler seems to push 'deactivate' to KEDA. KEDA then tries to scale to zero, and a few milliseconds later activates again due to the cron. So the replica is constantly starting and terminating.

I'm not a Go dev, but from

func (e *impl) IsActive(
	ctx context.Context,
	sor *externalscaler.ScaledObjectRef,
) (*externalscaler.IsActiveResponse, error) {
	lggr := e.lggr.WithName("IsActive")

	gmr, err := e.GetMetrics(ctx, &externalscaler.GetMetricsRequest{
		ScaledObjectRef: sor,
	})
	if err != nil {
		lggr.Error(err, "GetMetrics failed", "scaledObjectRef", sor.String())
		return nil, err
	}

	metricValues := gmr.GetMetricValues()
	if err := errors.New("len(metricValues) != 1"); len(metricValues) != 1 {
		lggr.Error(err, "invalid GetMetricsResponse", "scaledObjectRef", sor.String(), "getMetricsResponse", gmr.String())
		return nil, err
	}
	metricValue := metricValues[0].GetMetricValue()

	active := metricValue > 0
	res := &externalscaler.IsActiveResponse{
		Result: active,
	}
	return res, nil
}

and for the push

func (e *impl) StreamIsActive(
	scaledObject *externalscaler.ScaledObjectRef,
	server externalscaler.ExternalScaler_StreamIsActiveServer,
) error {
	// this function communicates with KEDA via the 'server' parameter.
	// we call server.Send (below) every streamInterval, which tells it to immediately
	// ping our IsActive RPC
	ticker := time.NewTicker(streamInterval)
	defer ticker.Stop()
	for {
		select {
		case <-server.Context().Done():
			return nil
		case <-ticker.C:
			active, err := e.IsActive(server.Context(), scaledObject)
			if err != nil {
				e.lggr.Error(
					err,
					"error getting active status in stream",
				)
				return err
			}
			err = server.Send(&externalscaler.IsActiveResponse{
				Result: active.Result,
			})
			if err != nil {
				e.lggr.Error(
					err,
					"error sending the active result in stream",
				)
				return err
			}
		}
	}
}

I get the feeling the HTTP scaler pushes the deactivation to KEDA regardless of which other scalers are active in the ScaledObject, and this makes KEDA deactivate the workload briefly. Is this something to be fixed in KEDA itself, or in the http-add-on?

Given the example from "Implementing StreamIsActive", shouldn't the external push scaler never push active=false?

Just to be clear: StreamIsActive calls IsActive, and the stream (push) should not return false. IsActive is probably fine to return false when called during polling. :-)

rd-zahari-aleksiev avatar Mar 25 '25 13:03 rd-zahari-aleksiev

I think this is the same issue -> https://github.com/kedacore/http-add-on/issues/1147

rd-zahari-aleksiev avatar Mar 29 '25 12:03 rd-zahari-aleksiev

@JorTurFer, what do you think, does my analysis make sense? :-)

rd-zahari-aleksiev avatar Apr 02 '25 05:04 rd-zahari-aleksiev

Hi, just wanted to +1 this issue. I'm having the same difficulties making a cron trigger work with the HTTP scaler. Version 0.10.0.

    - kind: HTTPScaledObject
      apiVersion: http.keda.sh/v1alpha1
      metadata:
        name: my-app
        annotations:
          httpscaledobject.keda.sh/skip-scaledobject-creation: "true"
      spec:
        hosts:
          - my-app.hello.world
        scaleTargetRef:
          name: my-app
          kind: Deployment
          apiVersion: apps/v1
          service: my-app
          port: 11434
        replicas:
          min: 0
          max: 3
        scaledownPeriod: 30
        scalingMetric:
          concurrency:
            targetValue: 10
    - kind: ScaledObject
      apiVersion: keda.sh/v1alpha1
      metadata:
        name: my-app
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: my-app
        pollingInterval: 10
        cooldownPeriod: 30
        initialCooldownPeriod: 0
        minReplicaCount: 0
        maxReplicaCount: 3
        triggers:
          - type: cron
            metadata:
              timezone: Europe/Paris
              start: 0 9 * * 1-5
              end: 0 19 * * 1-5
              desiredReplicas: "1"
          - type: external-push
            metadata:
              httpScaledObject: my-app
              scalerAddress: keda-add-ons-http-external-scaler.keda:9090

A pod spawns and then is immediately shut down

leorniduv avatar May 02 '25 10:05 leorniduv

If anyone needs this I've found a very stupid way of making this setup work. You basically create another cron ScaledObject that kills the Http Add-on at the same time 🤦 I tested it and it seems to be doing the job: what needs to be killed outside the cron gets killed, and during the cron, my app is up and scales based on http traffic.

    - kind: HTTPScaledObject
      apiVersion: http.keda.sh/v1alpha1
      metadata:
        name: my-app-http
      spec:
        hosts:
          - bla.bla.foo
        scaleTargetRef:
          name: my-app
          kind: Deployment
          apiVersion: apps/v1
          service: my-app
          port: 3601
        replicas:
          min: 1 # changed from 0 to 1
          max: 3
        scaledownPeriod: 30
        scalingMetric:
          concurrency:
            targetValue: 10
    - kind: ScaledObject
      apiVersion: keda.sh/v1alpha1
      metadata:
        name: my-app-cron
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: my-app
        pollingInterval: 10
        cooldownPeriod: 30
        initialCooldownPeriod: 0
        minReplicaCount: 0
        maxReplicaCount: 3
        triggers:
          - type: cron
            metadata:
              timezone: Europe/Paris
              start: 0 9 * * 1-5
              end: 0 19 * * 1-5
              desiredReplicas: "1"
  # New ScaledObject that kills the external-scaler on the same cron interval
    - kind: ScaledObject
      apiVersion: keda.sh/v1alpha1
      metadata:
        name: keda-external-scaler-cron
        namespace: keda
      spec:
        scaleTargetRef:
          apiVersion: apps/v1
          kind: Deployment
          name: keda-add-ons-http-external-scaler
        pollingInterval: 10
        cooldownPeriod: 30
        initialCooldownPeriod: 0
        minReplicaCount: 0
        maxReplicaCount: 3
        triggers:
          - type: cron
            metadata:
              timezone: Europe/Paris
              start: 0 9 * * 1-5
              end: 0 19 * * 1-5
              desiredReplicas: "1"

Again, this is very stupid and only works if you are using the HTTP add-on for a specific app (now that I think of it, you should probably deploy it in namespace mode rather than cluster-wide, to be a bit cleaner). Also, I'm not familiar with Go, so I don't think I can pull off a PR to fix the real issue. This is just a hack to get it working :/

leorniduv avatar May 02 '25 14:05 leorniduv

I'm totally new to this project, so forgive any idiocy. I don't know exactly how this works, but I've read some docs and am basing this on the comments in this issue. Bearing that in mind, doesn't this issue indicate a major design flaw?

My understanding of KEDA ScaledObjects is that they use the different scalers to generate metrics that are then used by the HPA to actually do the scaling. From the comments above it sounds like the HTTPScaledObject isn't doing that, but is directly interacting with the ScaledObject to activate/deactivate it, essentially bypassing the KEDA mechanism. Wouldn't it be cleaner to just have the interceptor emit a metric of the number of queued requests, then have a KEDA scaler that processes that metric?

I might be totally misunderstanding how things work, but I just wanted to chime in to check my understanding, as I'm really keen to use this project, but this issue would make it unusable for us.

gavinclarkeuk avatar May 23 '25 17:05 gavinclarkeuk

For me it is also a major issue, because teams are opting out of the HTTP add-on because of this, and we only use it in non-production environments, since it is beta.

From what I was able to see in the code, HTTPScaledObject uses the ScaledObject (whether managed by the HTTPScaledObject or not) and pushes activate/deactivate to KEDA through the external-push trigger, so it is NOT bypassing the KEDA mechanism. The problem, from my point of view, is that it should never emit active=false to KEDA, only active=true when the interceptor detects an HTTP call.

Bear in mind we are talking about the KEDA activate/deactivate phase. Pushing metrics to KEDA for the scaling phase is something different. The HPA doesn't play any role in the activate/deactivate phase, since it cannot scale to zero; KEDA does that by manipulating Deployment.spec.replicas (if you are using a Deployment as the k8s workload).
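
To make the split between the two phases concrete, here is a minimal sketch. It assumes minReplicaCount is 0 and ignores cooldownPeriod entirely; `replicaOwner` is a hypothetical helper for illustration, not a KEDA function.

```go
package main

import "fmt"

// replicaOwner is a hypothetical helper (not part of KEDA) that names which
// component decides the replica count in each situation, assuming
// minReplicaCount is 0 and ignoring cooldownPeriod.
func replicaOwner(currentReplicas int32, anyTriggerActive bool) string {
	switch {
	case currentReplicas == 0 && anyTriggerActive:
		// Activation phase: KEDA itself patches the workload's replicas.
		return "KEDA activates: 0 -> 1"
	case currentReplicas > 0 && !anyTriggerActive:
		// Deactivation phase: again KEDA, because the HPA cannot go to zero.
		return "KEDA deactivates: N -> 0"
	case currentReplicas > 0:
		// Scaling phase: the HPA works from the metrics KEDA exposes.
		return "HPA scales: 1..max"
	default:
		return "idle at 0"
	}
}

func main() {
	fmt.Println(replicaOwner(0, true))
	fmt.Println(replicaOwner(2, true))
	fmt.Println(replicaOwner(2, false))
	fmt.Println(replicaOwner(0, false))
}
```

The bug discussed in this issue sits in the first two cases: the question is what counts as "any trigger active" when a push scaler reports false.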

rd-zahari-aleksiev avatar May 23 '25 17:05 rd-zahari-aleksiev

Looking at your earlier comment, I agree with the diagnosis. I'm also not a Go developer, so I put Copilot to work and it suggested this implementation of StreamIsActive:

func (e *impl) StreamIsActive(scaledObject *externalscaler.ScaledObjectRef, server externalscaler.ExternalScaler_StreamIsActiveServer) error {
    ticker := time.NewTicker(streamInterval)
    defer ticker.Stop()

    var prevActive bool

    for {
        select {
        case <-server.Context().Done():
            return nil
        case <-ticker.C:
            activeResp, err := e.IsActive(server.Context(), scaledObject)
            if err != nil {
                e.lggr.Error(err, "error getting active status in stream")
                return err
            }
            // Only send when transitioning from inactive to active
            if activeResp.Result && !prevActive {
                err = server.Send(&externalscaler.IsActiveResponse{
                    Result: true,
                })
                if err != nil {
                    e.lggr.Error(err, "error sending the active result in stream")
                    return err
                }
            }
            prevActive = activeResp.Result
        }
    }
}

However this still feels somewhat inefficient to me because of the ticker loop, which by default fires every 200ms. Instead it would be better to drive the response from an event coming from the interceptor when an http request is queued. That would require bigger design changes, as it would mean the Interceptor calling the Scaler, whereas right now the calls go the other way.

gavinclarkeuk avatar May 23 '25 18:05 gavinclarkeuk

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 25 '25 06:07 stale[bot]

This very irritating bug is still present, so please do not mark the issue as stale.

rd-zahari-aleksiev avatar Jul 25 '25 06:07 rd-zahari-aleksiev

This definitely feels like a bug +1

efernandes-dev-ops avatar Aug 01 '25 21:08 efernandes-dev-ops

@rd-zahari-aleksiev @mengland-noaa Would you be able to do some testing to see if https://github.com/kedacore/http-add-on/pull/1321 satisfies your use case?

efernandes-dev-ops avatar Aug 02 '25 09:08 efernandes-dev-ops

Hmm, not sure if this bug is isolated to http-add-on. It seems like any external scaler in KEDA might suffer from this, and as a result it might be better fixed directly in KEDA.

wozniakjan avatar Aug 07 '25 09:08 wozniakjan

@efernandes-dev-ops it will be very hard for me to test it with the real setup we have in the cloud. :-( @wozniakjan indeed, I'm almost sure an external scaler has the same problem if used among other scalers. Still, the KEDA docs/examples show pushing only activation, so it is better to fix it here first.

rd-zahari-aleksiev avatar Aug 07 '25 11:08 rd-zahari-aleksiev

About the CPU/memory scaler + HTTP add-on: this is the expected behaviour, as the CPU scaler can't be used for scaling from/to zero and has to work alongside another external metric (the HTTP one in this case). This is because CPU isn't an external metric and it's not managed through KEDA.

About cron + HTTP, I agree with @wozniakjan. I think this is a bug related to external-push scalers and how activation is handled there.

The loop there is

	for _, ps := range cache.GetPushScalers() {
		go func(s scalers.PushScaler) {
			activeCh := make(chan bool)
			go s.Run(ctx, activeCh)
			for {
				select {
				case <-ctx.Done():
					return
				case active := <-activeCh:
					scalingMutex.Lock()
					switch obj := scalableObject.(type) {
					case *kedav1alpha1.ScaledObject:
						h.scaleExecutor.RequestScale(ctx, obj, active, false, &executor.ScaleExecutorOptions{})
					case *kedav1alpha1.ScaledJob:
						logger.Info("Warning: External Push Scaler does not support ScaledJob", "object", scalableObject)
					}
					scalingMutex.Unlock()
				}
			}
		}(ps)
	}

That's why KEDA scales the workload to zero as soon as the external push scaler reports active=false. I don't think that never reporting active=false is the solution, because then KEDA won't scale to zero based on HTTP traffic. The solution is to fix this logic on the KEDA side.
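
A simplified model of the difference, as a sketch only: `triggerState`, `reportedBehaviour` and `aggregated` are hypothetical names, and `aggregated` shows just one possible shape for a KEDA-side fix, not the actual change.

```go
package main

import "fmt"

// triggerState is a hypothetical snapshot of one non-push trigger.
type triggerState struct {
	name   string
	active bool
}

// reportedBehaviour models the loop quoted above: the value pushed by the
// external push scaler is handed to RequestScale on its own, so the other
// triggers are not consulted.
func reportedBehaviour(pushActive bool, _ []triggerState) bool {
	return pushActive
}

// aggregated is one possible alternative (an assumption, not KEDA's code):
// the workload stays active while any trigger reports activity.
func aggregated(pushActive bool, others []triggerState) bool {
	if pushActive {
		return true
	}
	for _, t := range others {
		if t.active {
			return true
		}
	}
	return false
}

func main() {
	// Cron window is open, but there is no HTTP traffic.
	others := []triggerState{{name: "cron", active: true}}

	fmt.Println(reportedBehaviour(false, others)) // false: scaled to zero
	fmt.Println(aggregated(false, others))        // true: cron keeps it up
	fmt.Println(aggregated(false, nil))           // false: scale to zero still works
}
```

The last case is the point raised above: with aggregation the push scaler can keep reporting active=false, and scale to zero based on HTTP traffic still happens once nothing else is active.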

JorTurFer avatar Aug 16 '25 19:08 JorTurFer

I've created this issue in the upstream repo -> https://github.com/kedacore/keda/issues/6986

JorTurFer avatar Aug 16 '25 20:08 JorTurFer

this will be fixed in KEDA 2.18 - https://github.com/kedacore/keda/issues/6986

wozniakjan avatar Oct 03 '25 09:10 wozniakjan