http-add-on
With multiple triggers (CPU and HTTP) and minReplicaCount of 0, KEDA erroneously scales to 0.
Report
When CPU and the http-external-scaler are configured together as triggers in the same ScaledObject, the HTTP scaler supersedes the CPU scaler. Under heavy CPU load with HTTP request(s) the workload scales up successfully, but KEDA subsequently intervenes and scales to 0, ignoring CPU.
Expected Behavior
Under heavy CPU load, KEDA should not scale down to 0 even when there are no HTTP requests.
Actual Behavior
The HTTP add-on appears to be overriding the CPU scaler.
Steps to Reproduce the Problem
- Create an nginx or other deployment paired with a CPU load-test sidecar or init container. The memory scaler behaves similarly.
- Send an HTTP request and watch as the workload initially scales up, then scales back down to 0.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: my-namespace
spec:
  selector:
    app: my-app
  type: ClusterIP
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  namespace: my-namespace
spec:
  replicas: 0
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
            limits:
              cpu: "500m"
              memory: "100Mi"
        - name: stress-ng
          image: polinux/stress-ng:latest
          command: ["/bin/sh", "-c"]
          args:
            - "echo 'Running stress-ng'; stress-ng --cpu 1 --vm 1 --vm-bytes 64M --timeout 300s; echo 'stress-ng finished'; sleep 3600"
          resources:
            requests:
              cpu: "100m"
              memory: "100Mi"
            limits:
              cpu: "1000m"
              memory: "1000Mi"
---
kind: ScaledObject
apiVersion: keda.sh/v1alpha1
metadata:
  name: my-scaled-object
  namespace: my-namespace
spec:
  initialCooldownPeriod: 120
  cooldownPeriod: 30
  minReplicaCount: 0
  maxReplicaCount: 4
  pollingInterval: 5
  fallback:
    failureThreshold: 5
    replicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment
  advanced:
    horizontalPodAutoscalerConfig:
      name: custom-hpa-name
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 300
  triggers:
    - type: cpu
      name: cpu_trig
      metricType: Utilization
      metadata:
        value: "10"
    - type: external
      name: http_trig
      metadata:
        httpScaledObject: my-scaled-object
        hosts: "myhost"
        scalerAddress: keda-add-ons-http-external-scaler.keda:9090
---
kind: HTTPScaledObject
apiVersion: http.keda.sh/v1alpha1
metadata:
  name: my-scaled-object
  namespace: my-namespace
  annotations:
    httpscaledobject.keda.sh/skip-scaledobject-creation: "true"
spec:
  hosts:
    - "myhost"
  scalingMetric:
    requestRate:
      granularity: 1s
      targetValue: 2
      window: 1m
  scaledownPeriod: 300
  scaleTargetRef:
    name: my-deployment
    service: my-service
    port: 80
  replicas:
    min: 0
    max: 4
  targetPendingRequests: 1
---
kind: Service
apiVersion: v1
metadata:
  name: keda-add-ons-http-interceptor-proxy
  namespace: my-namespace
spec:
  type: ExternalName
  externalName: keda-add-ons-http-interceptor-proxy.keda.svc.cluster.local
```
Logs from KEDA HTTP operator
No response
HTTP Add-on Version
0.10.0
Kubernetes Version
None
Platform
Any
Anything else?
No response
I have the same problem with the HTTP scaler + cron. The cron declares 1 replica for some interval, but if there are no HTTP requests the HTTP scaler seems to push 'deactivate' to KEDA, KEDA tries to scale to zero, and a few milliseconds later it activates again due to the cron. So the replica is constantly starting and terminating.
I'm not a Go dev, but from

```go
func (e *impl) IsActive(
	ctx context.Context,
	sor *externalscaler.ScaledObjectRef,
) (*externalscaler.IsActiveResponse, error) {
	lggr := e.lggr.WithName("IsActive")
	gmr, err := e.GetMetrics(ctx, &externalscaler.GetMetricsRequest{
		ScaledObjectRef: sor,
	})
	if err != nil {
		lggr.Error(err, "GetMetrics failed", "scaledObjectRef", sor.String())
		return nil, err
	}
	metricValues := gmr.GetMetricValues()
	if err := errors.New("len(metricValues) != 1"); len(metricValues) != 1 {
		lggr.Error(err, "invalid GetMetricsResponse", "scaledObjectRef", sor.String(), "getMetricsResponse", gmr.String())
		return nil, err
	}
	metricValue := metricValues[0].GetMetricValue()
	active := metricValue > 0
	res := &externalscaler.IsActiveResponse{
		Result: active,
	}
	return res, nil
}
```
and for the push:

```go
func (e *impl) StreamIsActive(
	scaledObject *externalscaler.ScaledObjectRef,
	server externalscaler.ExternalScaler_StreamIsActiveServer,
) error {
	// this function communicates with KEDA via the 'server' parameter.
	// we call server.Send (below) every streamInterval, which tells it to immediately
	// ping our IsActive RPC
	ticker := time.NewTicker(streamInterval)
	defer ticker.Stop()
	for {
		select {
		case <-server.Context().Done():
			return nil
		case <-ticker.C:
			active, err := e.IsActive(server.Context(), scaledObject)
			if err != nil {
				e.lggr.Error(
					err,
					"error getting active status in stream",
				)
				return err
			}
			err = server.Send(&externalscaler.IsActiveResponse{
				Result: active.Result,
			})
			if err != nil {
				e.lggr.Error(
					err,
					"error sending the active result in stream",
				)
				return err
			}
		}
	}
}
```
I get the feeling the HTTP scaler pushes the deactivation to KEDA regardless of what other active scalers there are in the ScaledObject, and this briefly makes KEDA deactivate the workload. Is this something to be fixed in KEDA itself, or in the http-add-on?
Given the example from Implementing StreamIsActive, the external push scaler should never push active=false, should it?
StreamIsActive is calling IsActive, and the stream (push) should not return false. IsActive is probably fine to return false when called during polling, just to be clear :-)
I think it is the same issue -> https://github.com/kedacore/http-add-on/issues/1147
@JorTurFer, what do you think, does my analysis make sense? :-)
Hi, just wanted to +1 this issue; I'm having the same difficulties making a cron trigger work with an HTTP scaler. Version 0.10.0.
```yaml
- kind: HTTPScaledObject
  apiVersion: http.keda.sh/v1alpha1
  metadata:
    name: my-app
    annotations:
      httpscaledobject.keda.sh/skip-scaledobject-creation: "true"
  spec:
    hosts:
      - my-app.hello.world
    scaleTargetRef:
      name: my-app
      kind: Deployment
      apiVersion: apps/v1
      service: my-app
      port: 11434
    replicas:
      min: 0
      max: 3
    scaledownPeriod: 30
    scalingMetric:
      concurrency:
        targetValue: 10
- kind: ScaledObject
  apiVersion: keda.sh/v1alpha1
  metadata:
    name: my-app
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: my-app
    pollingInterval: 10
    cooldownPeriod: 30
    initialCooldownPeriod: 0
    minReplicaCount: 0
    maxReplicaCount: 3
    triggers:
      - type: cron
        metadata:
          timezone: Europe/Paris
          start: 0 9 * * 1-5
          end: 0 19 * * 1-5
          desiredReplicas: "1"
      - type: external-push
        metadata:
          httpScaledObject: my-app
          scalerAddress: keda-add-ons-http-external-scaler.keda:9090
```
A pod spawns and then is immediately shut down.
If anyone needs this I've found a very stupid way of making this setup work. You basically create another cron ScaledObject that kills the Http Add-on at the same time 🤦 I tested it and it seems to be doing the job: what needs to be killed outside the cron gets killed, and during the cron, my app is up and scales based on http traffic.
```yaml
- kind: HTTPScaledObject
  apiVersion: http.keda.sh/v1alpha1
  metadata:
    name: my-app-http
  spec:
    hosts:
      - bla.bla.foo
    scaleTargetRef:
      name: my-app
      kind: Deployment
      apiVersion: apps/v1
      service: my-app
      port: 3601
    replicas:
      min: 1 # changed from 0 to 1
      max: 3
    scaledownPeriod: 30
    scalingMetric:
      concurrency:
        targetValue: 10
- kind: ScaledObject
  apiVersion: keda.sh/v1alpha1
  metadata:
    name: my-app-cron
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: my-app
    pollingInterval: 10
    cooldownPeriod: 30
    initialCooldownPeriod: 0
    minReplicaCount: 0
    maxReplicaCount: 3
    triggers:
      - type: cron
        metadata:
          timezone: Europe/Paris
          start: 0 9 * * 1-5
          end: 0 19 * * 1-5
          desiredReplicas: "1"
# New ScaledObject that kills the external-scaler on the same cron interval
- kind: ScaledObject
  apiVersion: keda.sh/v1alpha1
  metadata:
    name: keda-external-scaler-cron
    namespace: keda
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: keda-add-ons-http-external-scaler
    pollingInterval: 10
    cooldownPeriod: 30
    initialCooldownPeriod: 0
    minReplicaCount: 0
    maxReplicaCount: 3
    triggers:
      - type: cron
        metadata:
          timezone: Europe/Paris
          start: 0 9 * * 1-5
          end: 0 19 * * 1-5
          desiredReplicas: "1"
```
Again, this is very stupid and only works if you are using the HTTP add-on for one specific app (I should probably deploy in namespace mode rather than cluster-wide, now that I think of it, to be a bit cleaner). Also, I'm not familiar with Go, so I don't think I can pull off a PR to fix the real issue; this is just a hack to get it working :/
I'm totally new to this project, so forgive any idiocy. I don't know exactly how this works, but I've read some docs and I'm basing this on the comments in this issue. Bearing that in mind, doesn't this issue indicate a major design flaw?
My understanding of KEDA ScaledObjects is that they use the different scalers to generate metrics that are then used by the HPA to actually do the scaling. From the comments above it sounds like the HTTPScaledObject isn't doing that, but is directly interacting with the ScaledObject to activate/deactivate it, essentially bypassing the KEDA mechanism. Wouldn't it be cleaner to just have the Interceptor emit a metric of the number of queued requests, then have a KEDA scaler that processes that metric?
I might be totally misunderstanding how things work, but I just wanted to chime in to check my understanding; I'm really keen to use this project, but this issue would make it unusable for us.
For me it is also something major, because teams are opting out of the HTTP add-on because of this, and we are using it only on non-production environments since it is beta.
From what I was able to see in the code, the HTTPScaledObject uses the ScaledObject (managed by the HTTPScaledObject or not) and, through the external push trigger, pushes activate/deactivate to KEDA, so it is NOT bypassing the KEDA mechanism. The problem, from my point of view, is that it should never emit active=false to KEDA; only active=true when the interceptor detects an HTTP call.
Bear in mind we are talking about the KEDA activate/deactivate phase. Pushing metrics to KEDA for the scaling phase is something different. The HPA doesn't play any role in the activate/deactivate phase, since it cannot scale to zero; KEDA does that by manipulating Deployment.spec.replicas (if you are using a Deployment as your k8s workload).
Looking at your earlier comment, I agree with the diagnosis. I'm also not a Go developer, so I put Copilot to work and it suggested this implementation of StreamIsActive:
```go
func (e *impl) StreamIsActive(scaledObject *externalscaler.ScaledObjectRef, server externalscaler.ExternalScaler_StreamIsActiveServer) error {
	ticker := time.NewTicker(streamInterval)
	defer ticker.Stop()
	var prevActive bool
	for {
		select {
		case <-server.Context().Done():
			return nil
		case <-ticker.C:
			activeResp, err := e.IsActive(server.Context(), scaledObject)
			if err != nil {
				e.lggr.Error(err, "error getting active status in stream")
				return err
			}
			// Only send when transitioning from inactive to active
			if activeResp.Result && !prevActive {
				err = server.Send(&externalscaler.IsActiveResponse{
					Result: true,
				})
				if err != nil {
					e.lggr.Error(err, "error sending the active result in stream")
					return err
				}
			}
			prevActive = activeResp.Result
		}
	}
}
```
However, this still feels somewhat inefficient to me because of the ticker loop, which by default fires every 200ms. It would be better to drive the response from an event emitted by the interceptor when an HTTP request is queued. That would require bigger design changes, as it would mean the interceptor calling the scaler, whereas right now the calls go the other way.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
This very irritating bug is still present, so please do not mark the issue as stale.
This definitely feels like a bug +1
@rd-zahari-aleksiev @mengland-noaa Would you be able to do some testing to see if this satisfies your use case? https://github.com/kedacore/http-add-on/pull/1321
Hmm, I'm not sure this bug is isolated to the http-add-on; it seems like any external scaler in KEDA might suffer from it, so it might be better fixed directly in KEDA.
@efernandes-dev-ops it will be very hard for me to test it with the real setup we have in the cloud. :-( @wozniakjan indeed, I'm almost sure this external scaler has the same problem when used among other scalers. Still, the KEDA docs/examples only show pushing the activation, so better to fix it here first.
About CPU/Memory scaler + HTTP add-on: this is the expected behaviour, as the CPU scaler can't be used for scaling from/to zero and has to work alongside another external metric (the HTTP one in this case). This is because CPU isn't an external metric and isn't managed through KEDA.
About cron + HTTP: I agree with @wozniakjan, I think this is a bug related to external-push scalers and how the activation is treated there.
The loop there is:

```go
for _, ps := range cache.GetPushScalers() {
	go func(s scalers.PushScaler) {
		activeCh := make(chan bool)
		go s.Run(ctx, activeCh)
		for {
			select {
			case <-ctx.Done():
				return
			case active := <-activeCh:
				scalingMutex.Lock()
				switch obj := scalableObject.(type) {
				case *kedav1alpha1.ScaledObject:
					h.scaleExecutor.RequestScale(ctx, obj, active, false, &executor.ScaleExecutorOptions{})
				case *kedav1alpha1.ScaledJob:
					logger.Info("Warning: External Push Scaler does not support ScaledJob", "object", scalableObject)
				}
				scalingMutex.Unlock()
			}
		}
	}(ps)
}
```
That's why, as soon as the external push scaler reports active=false, KEDA scales the workload to zero. I don't think that not reporting active=false is the solution, because then KEDA won't scale to zero based on HTTP traffic; instead, this logic should be fixed on the KEDA side.
I've created this issue in the upstream repo -> https://github.com/kedacore/keda/issues/6986
This will be fixed in KEDA 2.18 - https://github.com/kedacore/keda/issues/6986