Autoscaling did not scale though targetBufferAvailability < 90

Open vigith opened this issue 4 months ago • 5 comments

Summary

Autoscaling uses floor when computing the targetBufferAvailability percentage, which makes autoscaling slow to kick in. This increases tail latency.
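
To see why the rounding direction matters, here is a minimal sketch. It assumes, hypothetically, that the desired replica count scales with the ratio of target to observed buffer availability. `desired_replicas` and its parameters are illustrative names, not Numaflow's actual code or formula.

```python
import math


def desired_replicas(current, observed_avail_pct, target_avail_pct, rounding):
    """Hypothetical proportional model: each replica is assumed to contribute
    an equal share of the observed free buffer space."""
    if observed_avail_pct <= 0:
        raise ValueError("observed availability must be positive")
    raw = current * target_avail_pct / observed_avail_pct
    # Never go below one replica, whatever the rounding function returns.
    return max(1, rounding(raw))


# One replica, target 90% availability, observed 60%: the raw value is 1.5.
print(desired_replicas(1, 60, 90, math.floor))  # floor stays at 1 replica
print(desired_replicas(1, 60, 90, math.ceil))   # ceiling asks for 2 replicas
# Observed 89%, barely under target: ceiling still adds a replica.
print(desired_replicas(1, 89, 90, math.ceil))
```

Under this assumed model, floor holds the replica count until the raw value reaches the next integer, while ceiling reacts to any shortfall, however small.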

Use Cases

  • improve tail latency

Notes

  scale:
    disabled: false
    lookbackSeconds: 120
    max: 50
    min: 1
    replicasPerScaleDown: 2
    replicasPerScaleUp: 4
    scaleDownCooldownSeconds: 90
    scaleUpCooldownSeconds: 90
    targetBufferAvailability: 90
    targetProcessingSeconds: 20

Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

vigith avatar Sep 04 '25 16:09 vigith

One solution we can think of for udf and sink:

  1. When there's backpressure, use targetBufferAvailability for autoscaling;
  2. Otherwise use targetProcessingSeconds.
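
The two-branch idea above could be sketched roughly as follows. This is only an illustration: the definition of backpressure (free buffer below target), the proportional formulas, and the name `pick_signal` are all assumptions, not Numaflow's implementation. The function returns the unrounded value so that it takes no position on floor versus ceiling.

```python
def pick_signal(current, pending, buffer_len, rate_per_sec,
                target_avail_pct, target_processing_sec):
    """Return (signal_name, raw_desired_replicas) for a UDF or sink vertex.

    Hypothetical model; all inputs are assumed to come from recent metrics.
    """
    avail_pct = 100.0 * (buffer_len - pending) / buffer_len
    if avail_pct < target_avail_pct:
        # Assumed backpressure condition: free buffer is below the target.
        if avail_pct <= 0:
            # Buffer is full; the ratio is undefined, so just step up by one.
            return ("buffer", float(current + 1))
        return ("buffer", current * target_avail_pct / avail_pct)
    if rate_per_sec <= 0:
        # No throughput observed; keep the current replica count.
        return ("processing", float(current))
    # Replicas needed to drain pending messages within the target time.
    seconds_to_drain = pending / rate_per_sec
    return ("processing", current * seconds_to_drain / target_processing_sec)


# 60% of the buffer is free, target is 90%: the buffer branch is used.
print(pick_signal(2, 400, 1000, 50.0, 90, 20))
# 95% of the buffer is free: the processing-time branch is used.
print(pick_signal(2, 50, 1000, 50.0, 90, 20))
```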

whynowy avatar Sep 04 '25 16:09 whynowy

We cannot simply use ceiling, since that might cause a fusion reaction.

whynowy avatar Sep 04 '25 16:09 whynowy

I had brainstormed this initially and had considered a few options, which included the above as well.

  1. Switching between backpressure-based and individual rate-based scaling was one of them, but that approach comes with caveats: consider the logic for the switch itself, as well as ping-ponging under varying input loads, buffer limits, etc. So it wouldn't be a simple switch on backpressure alone. We would need to run more simulations and iron out the math for smoother, more effective scaling.

kohlisid avatar Sep 04 '25 20:09 kohlisid

I am thinking whether we have to include the processing speed. The goal is to minimize tail latency. We need to simulate and see how to get to a smoother curve.

vigith avatar Sep 04 '25 22:09 vigith

I had started on a hybrid math that accounts for something like drain rate per pod, and then also checks the efficiency based on the previous scale.
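
The comment gives no formulas, so the following is only one possible reading of "drain rate per pod" and "efficiency based on the previous scale". Both functions and their definitions are hypothetical, for illustration.

```python
def drain_rate_per_pod(total_rate, replicas):
    """Average messages per second drained by each pod (assumed definition)."""
    return total_rate / replicas if replicas > 0 else 0.0


def scale_efficiency(prev_rate, prev_replicas, cur_rate, cur_replicas):
    """Observed rate gain divided by the gain expected if every added pod
    drained as fast as the average pod did before the scale-up.

    Returns None when the ratio is undefined (no pods added, or no baseline).
    """
    added = cur_replicas - prev_replicas
    expected_gain = added * drain_rate_per_pod(prev_rate, prev_replicas)
    if added <= 0 or expected_gain <= 0:
        return None
    return (cur_rate - prev_rate) / expected_gain


# Going from 2 to 4 pods raised throughput from 100 to 160 msg/s.
# Each pod drained 50 msg/s before, so 100 msg/s of gain was expected.
print(scale_efficiency(100.0, 2, 160.0, 4))
```

A value well below 1 would suggest the last scale-up helped less than expected, for example because the bottleneck is downstream.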

kohlisid avatar Sep 05 '25 18:09 kohlisid