[Feature] Add support for Horizontal Pod Autoscaling
Hello! Please add support for Horizontal Pod Autoscaling. This is the current output I see:
│ 100m -> ? (No data) │ 100m -> ? (No data) │ │ 100Mi -> ? (No data) │ 100Mi -> ? (No data) │
[truncated krr results table: every CPU and memory request/limit recommendation reads "? (No data)" or "? (Not enough data)"]
Hey all, we're still figuring out the appropriate algorithm to use in this case. (The usual logic doesn't work if you're scaling according to CPU/memory utilization.)
To help, we'd love to hear from each of you:
- How is your HPA defined? What metric do you use to scale?
- What logic would make the most sense for recommendations when using the HPA?
I use the standard CPU request and memory request utilization metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    meta.helm.sh/release-name: application
    meta.helm.sh/release-namespace: application
  creationTimestamp: "2023-07-24T19:28:02Z"
  labels:
    app-name: application
    app-version: release-f469738c
    app.kubernetes.io/managed-by: Helm
  name: application
  namespace: application
  resourceVersion: "249236825"
  uid: 15d78a48-a594-4908-8723-82ba7291f6b0
spec:
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 60
        type: Pods
        value: 4
      - periodSeconds: 60
        type: Percent
        value: 25
      selectPolicy: Max
    scaleUp:
      policies:
      - periodSeconds: 60
        type: Pods
        value: 4
      - periodSeconds: 60
        type: Percent
        value: 25
      selectPolicy: Max
      stabilizationWindowSeconds: 0
  maxReplicas: 25
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 80
        type: Utilization
    type: Resource
  - resource:
      name: cpu
      target:
        averageUtilization: 80
        type: Utilization
    type: Resource
  minReplicas: 6
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application
status:
  conditions:
  - lastTransitionTime: "2023-07-24T19:28:17Z"
    message: recommended size matches current size
    reason: ReadyForNewScale
    status: "True"
    type: AbleToScale
  - lastTransitionTime: "2023-11-16T19:17:09Z"
    message: the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
    reason: ValidMetricFound
    status: "True"
    type: ScalingActive
  - lastTransitionTime: "2023-11-16T18:59:50Z"
    message: the desired replica count is less than the minimum replica count
    reason: TooFewReplicas
    status: "True"
    type: ScalingLimited
  currentMetrics:
  - resource:
      current:
        averageUtilization: 59
        averageValue: 457606485333m
      name: memory
    type: Resource
  - resource:
      current:
        averageUtilization: 18
        averageValue: 202m
      name: cpu
    type: Resource
  currentReplicas: 6
  desiredReplicas: 6
  lastScaleTime: "2023-11-16T18:04:52Z"
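As a side note on the status above: Kubernetes scales with the documented formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), which is exactly why this HPA reports TooFewReplicas. A quick check in Python:

import math

# Kubernetes' documented HPA formula:
#   desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
current_replicas = 6
target = 80  # spec: target.averageUtilization for both metrics

for name, current in [("memory", 59), ("cpu", 18)]:
    desired = math.ceil(current_replicas * current / target)
    print(f"{name}: desired replicas = {desired}")

# memory: ceil(6 * 59 / 80) = 5; cpu: ceil(6 * 18 / 80) = 2.
# The max across metrics (5) is still below minReplicas (6), hence the
# TooFewReplicas condition and the replica count staying pinned at 6.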
Thanks. In a case like this, we can't just use the standard KRR strategy, because it leads to unstable recommendations.
E.g. if the HPA sets target.averageUtilization: 80 for CPU, but KRR is outputting CPU recommendations at the 99th percentile, then KRR's recommendations can cause the HPA to scale constantly, and the HPA's behaviour can in turn change KRR's recommendations.
We likely need a strategy that takes both into account from the start.
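To make that feedback loop concrete, here is a toy simulation. Everything in it is an illustrative assumption, not KRR's actual strategy: the demand number, and the rule "set the request to observed peak usage" standing in for a p99-based recommendation.

import math

# Toy model of a utilization-targeting HPA interacting with a
# percentile-based request recommender. Illustrative only.
TARGET = 0.80          # HPA target.averageUtilization as a fraction
TOTAL_DEMAND = 1200.0  # assumed steady aggregate CPU demand, millicores

replicas, request = 6, 100.0
for step in range(6):
    usage = TOTAL_DEMAND / replicas  # per-pod usage
    utilization = usage / request
    # HPA step: scale replicas toward the target utilization.
    replicas = max(1, math.ceil(replicas * utilization / TARGET))
    # Recommender step: size the request off observed usage (peak ~ p99 here).
    request = usage
    print(f"step {step}: replicas={replicas} request={request:.0f}m "
          f"utilization={utilization:.0%}")

# The two controllers chase each other: setting request == usage pushes
# utilization to ~100%, above the 80% target, so the HPA scales up; more
# replicas lowers per-pod usage, the next recommendation lowers the
# request, and the loop oscillates instead of converging.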
Hi, for deployments where HPA is enabled, instead of providing recommendations for the deployment, could it provide recommendations per pod? Almost every pod behind the HPA gets the same memory and CPU utilization.
Hello @aantn, is there any plan to add support for HPA resources?
Yep, the tricky part is - as always - what the algorithm should be.
We have some ideas for the algorithm, but for starters we're going to add an --allow-hpa flag so that you can opt in to the current algorithm anyway: "give me a recommendation even though the pod uses the HPA, it's OK with me".
And if you have input on what an HPA-optimized algorithm should look like, I'd love to chat!
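For illustration only, here is one direction such an HPA-optimized algorithm could take. This is a hypothetical sketch, not krr's code; the function name and parameters are made up for this example. The idea: size the request off typical usage and the HPA's own target, and let the HPA absorb peaks by adding replicas.

def hpa_aware_cpu_request(typical_usage_mcores: float,
                          target_utilization: float) -> float:
    """Pick a request so steady-state utilization lands on the HPA target.

    typical_usage_mcores: per-pod usage at a low percentile (e.g. p50).
    target_utilization: HPA target.averageUtilization as a fraction (0-1).
    """
    return typical_usage_mcores / target_utilization

# Example: pods typically use ~200m and the HPA targets 80% utilization,
# so a 250m request keeps steady-state utilization near the target while
# peaks are handled by scaling out rather than by per-pod headroom.
print(hpa_aware_cpu_request(200, 0.80))  # -> 250.0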
Thank you for your prompt response, having the flag is a great start in my opinion.
Of course. We'll have a PR fairly soon.
Great stuff @aantn, what are the plans & timeline for releasing it?
Can you try --allow_hpa on #226?
Hello @aantn,
I have tested the --allow_hpa flag and now I can see CPU & Memory recommendations for the workloads that we have HPA configured for.
Excellent. Happy to hear it!
Thank you @aantn for the quick implementation
Which version does the allow_hpa flag work with?
Only on the branch in #226. Until we merge and do a release, you'll have to check it out locally and run from source, according to the instructions in the README.
How can I download a binary from this branch without downloading the entire project?
It's not possible at the moment. You need to check out the whole project and follow the from-source instructions here: https://github.com/robusta-dev/krr?tab=readme-ov-file#installation-methods
I'll wait for the new release.
Maybe create an alpha release so we can test the HPA support?
This is included in the latest release! Let me know if it works for you.