Autoscaling did not scale even though buffer availability dropped below targetBufferAvailability (90)
Summary
Autoscaling uses floor when computing the buffer availability percentage against targetBufferAvailability, which makes scale-up kick in on the slower side. This increases tail latency.
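For illustration, here is a minimal sketch of how the rounding choice changes the outcome. The variable names and the formula are assumptions made up for this example, not the controller's actual code:

```go
package main

import (
	"fmt"
	"math"
)

// Sketch only: when the "needed replicas" figure (or the percentage it is
// derived from) is floored, a small deficit never produces an extra pod,
// so scale-up lags behind the real buffer pressure.
func main() {
	currentReplicas := 3.0
	pendingRatio := 1.25 // hypothetical: buffer is ~25% behind the target availability

	desired := currentReplicas * pendingRatio // 3.75 replicas "needed"

	fmt.Println(int(math.Floor(desired))) // 3 -> no scale-up; tail latency keeps growing
	fmt.Println(int(math.Ceil(desired)))  // 4 -> scales immediately, but ceiling everywhere risks over-reaction
	fmt.Println(int(math.Round(desired))) // 4 -> one possible middle ground
}
```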
Use Cases
- improve tail latency
Notes
```yaml
scale:
  disabled: false
  lookbackSeconds: 120
  max: 50
  min: 1
  replicasPerScaleDown: 2
  replicasPerScaleUp: 4
  scaleDownCooldownSeconds: 90
  scaleUpCooldownSeconds: 90
  targetBufferAvailability: 90
  targetProcessingSeconds: 20
```
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
One solution we can think of for UDF and sink vertices:
- When there's backpressure, use targetBufferAvailability for autoscaling;
- Otherwise, use targetProcessingSeconds.

We cannot simply use ceiling, which might cause a fusion reaction of over-aggressive scale-ups.
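A rough sketch of that switching idea, with hypothetical types and helper names (this is not the controller's actual API, only an illustration of the proposal under the assumptions named in the comments):

```go
package main

import (
	"fmt"
	"math"
)

// vertexStats holds the hypothetical inputs the autoscaler would look at
// for a UDF/sink vertex. The field names are illustrative only and do not
// mirror Numaflow's actual types.
type vertexStats struct {
	replicas                 int
	pendingMessages          float64 // messages waiting in the inbound buffer
	bufferLength             float64 // total buffer capacity
	processingRatePerPod     float64 // messages/second one pod drains
	targetBufferAvailability float64 // e.g. 0.90
	targetProcessingSeconds  float64 // e.g. 20
	backpressured            bool    // buffer filling faster than it drains
}

// desiredReplicas switches the scaling signal as proposed above: under
// backpressure, size the vertex so the buffer regains its target
// availability; otherwise, size it so pending work drains within
// targetProcessingSeconds.
func desiredReplicas(s vertexStats) int {
	var d float64
	if s.backpressured {
		// Allowed occupancy is (1 - targetBufferAvailability) of the buffer.
		allowedPending := (1 - s.targetBufferAvailability) * s.bufferLength
		d = float64(s.replicas) * s.pendingMessages / math.Max(allowedPending, 1)
	} else {
		d = s.pendingMessages / (s.processingRatePerPod * s.targetProcessingSeconds)
	}
	// Rounding vs. flooring vs. ceiling is itself the open question in this
	// issue; Round is used here only as a placeholder.
	return int(math.Round(math.Max(d, 1)))
}

func main() {
	fmt.Println(desiredReplicas(vertexStats{
		replicas: 2, pendingMessages: 8000, bufferLength: 10000,
		processingRatePerPod: 100, targetBufferAvailability: 0.90,
		targetProcessingSeconds: 20, backpressured: true,
	}))
}
```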
I had brainstormed this initially and considered a few options, which included the above as well.
- The switch between backpressure-based and individual-rate-based scaling was one of them, but there are caveats attached to that approach: the logic for the switch itself, ping-ponging under varying input loads, buffer limits, etc. So it wouldn't be a simple switch on backpressure alone; we would need to run more simulations and iron out the math for smoother, more effective scaling.
I am wondering whether we have to include the processing speed at all. The goal is to minimize tail latency; we need to simulate and see how to get to a smoother curve.
I had started on some hybrid math that accounts for something like the drain rate per pod, and then also checks the efficiency based on the previous scale-up.
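As a very rough sketch of what such a hybrid could look like (all names and the efficiency heuristic below are assumptions for illustration; the actual math still needs simulation and tuning):

```go
package main

import (
	"fmt"
	"math"
)

// window holds hypothetical observations over the lookback period;
// the names are illustrative only.
type window struct {
	pending         float64 // messages still queued
	drainRatePerPod float64 // observed messages/second per pod now
	replicas        int
	prevReplicas    int
	prevDrainRate   float64 // per-pod drain rate before the last scale step
}

// hybridDesired sizes the vertex so pending work drains within
// targetProcessingSeconds at the observed per-pod drain rate, then damps
// the step when the previous scale-up yielded little extra throughput
// (e.g. the bottleneck is downstream), to reduce ping-ponging.
func hybridDesired(w window, targetProcessingSeconds float64) int {
	needed := w.pending / (w.drainRatePerPod * targetProcessingSeconds)

	// Efficiency of the previous step: how much of the added capacity
	// actually turned into extra throughput (1.0 == scaled linearly).
	efficiency := 1.0
	if w.replicas > w.prevReplicas && w.prevDrainRate > 0 {
		gained := float64(w.replicas)*w.drainRatePerPod - float64(w.prevReplicas)*w.prevDrainRate
		added := float64(w.replicas-w.prevReplicas) * w.prevDrainRate
		efficiency = math.Max(0, math.Min(1, gained/added))
	}

	step := needed - float64(w.replicas)
	return w.replicas + int(math.Round(step*efficiency))
}

func main() {
	// Example: 12k pending, each pod drains ~100 msg/s, 20s target.
	fmt.Println(hybridDesired(window{
		pending: 12000, drainRatePerPod: 100,
		replicas: 3, prevReplicas: 2, prevDrainRate: 110,
	}, 20)) // prints 5: the damped step, instead of jumping straight to 6
}
```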