Use API v2 for HPA creation

Open apankratov opened this issue 4 years ago • 1 comment

FDDEP-0010: Switch API version for HPA object creation to v2beta2

Summary

Currently, HPA objects are created with API v1, which supports only a single scaling metric (average pod CPU utilization). We need to switch the API version to v2 so that users can autoscale deployments on an extended set of metrics (e.g. those provided by Prometheus).

Motivation

This change will make it possible to extend the PAAS spec so that users can scale their deployments based on relevant metrics.
It will also solve the "sidecar problem": with HPA v1, CPU utilization is averaged across all containers in a pod, so sidecars with low CPU usage can pull the pod average down and prevent proper scaling.

Goals

The goal is simply to switch HPA to v2, which will:

maintain backward compatibility, i.e. the current PAAS spec for replicas will work exactly as before in terms of scaling logic
make it possible to scale on various resource metrics
allow extending the FIAAS ecosystem with scaling based on business metrics

Non-Goals

add support for custom metrics
update PAAS spec

Proposal

User stories

Sidecar problem

As a user, I have a pod with one or more sidecars (for example, a Datadog agent) and specify a CPU threshold of 75% for scaling. If the Datadog agent's CPU usage is 0% while the application container's CPU usage is 90%, the pod average will be below 75% and scale-up won't happen.
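
The arithmetic behind this story can be sketched as follows. This is a minimal illustration, assuming both containers request the same amount of CPU so that pod-level utilization (total usage over total requests) reduces to a plain average. The function name and the millicore figures are hypothetical and are not taken from FIAAS or Kubernetes code.

```python
def pod_cpu_utilization(containers):
    """Pod-level CPU utilization in percent: total usage over total requests.

    `containers` is a list of (usage_millicores, request_millicores) pairs.
    """
    total_usage = sum(usage for usage, _ in containers)
    total_request = sum(request for _, request in containers)
    return 100.0 * total_usage / total_request


# Hypothetical pod: the application container runs at 90% of its request,
# the idle sidecar at 0% of an equally sized request (assumed figures).
app = (90, 100)
sidecar = (0, 100)
THRESHOLD = 75

utilization = pod_cpu_utilization([app, sidecar])
print(utilization)              # 45.0
print(utilization > THRESHOLD)  # False: no scale-up although the app is at 90%
```

With unequal requests the result is a request-weighted figure rather than a plain average, but an idle sidecar still pulls the pod-level value down.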

Business metrics

As a user, I want to scale my deployment based on a relevant business metric (e.g. request rate). With HPA v1, I simply cannot specify such a metric.

⚠️ Caveats ⚠️

HPA API v2 was introduced in k8s version 1.9+, so simply switching to API v2 would break backward compatibility for older cluster versions.

Mitigations

There are several possible ways to mitigate the caveat listed above:

We can simply state in the release notes that the new version includes breaking changes and that older k8s versions are no longer supported.
We can "branch" the development and make tagged releases.
Instead of simply switching to API v2, we can maintain backward compatibility by adding cluster version detection that uses API v1 for older clusters and v2 for newer ones.
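
The version-detection option could look roughly like the sketch below. It is only an illustration: the function names are hypothetical, the (1, 9) threshold is simply the version quoted in the caveat above, and how the server version is fetched from the cluster is left out. The inputs mirror the `major` and `minor` strings that the Kubernetes `/version` endpoint reports, where `minor` may carry a suffix such as `18+`.

```python
from itertools import takewhile


def parse_version(major, minor):
    """Parse version strings such as major="1", minor="18+" into (1, 18)."""
    # Keep only the leading digits so provider suffixes like "+" are ignored.
    minor_digits = "".join(takewhile(str.isdigit, minor))
    return int(major), int(minor_digits)


def select_hpa_api_version(server_version, min_v2_version=(1, 9)):
    """Return the autoscaling API version to use for a cluster.

    `server_version` is a (major, minor) tuple. The default threshold is an
    assumption taken from the caveat in this proposal, not a verified value.
    """
    if tuple(server_version) >= tuple(min_v2_version):
        return "autoscaling/v2beta2"
    return "autoscaling/v1"


print(select_hpa_api_version(parse_version("1", "18+")))  # autoscaling/v2beta2
print(select_hpa_api_version(parse_version("1", "8")))    # autoscaling/v1
```

Keeping the threshold as a parameter means the cut-off can be corrected in one place if the minimum supported version turns out to be different.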

Design details

💡 Ideally we should probably switch to the official k8s client library instead of the fiaas one, but that would require too much refactoring, which is out of the scope of this proposal.

We need to add models for the autoscaling v2 objects to fiaas/k8s.
We need to update the FDD (fiaas-deploy-daemon) autoscaler deployer to use the new models.
We need to keep supporting the current paas.yaml spec, so we need to transform the current metrics specification into the v2 format, i.e. targetCPUUtilizationPercentage: XX should be transformed into the following v2 metrics spec:

metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: XX
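
The transformation itself is small. The sketch below builds the metrics list above from a v1-style CPU target using plain dictionaries. It does not use the fiaas/k8s models (which would still have to be written), and the function name is hypothetical.

```python
def cpu_target_to_v2_metrics(target_cpu_utilization_percentage):
    """Translate a v1 targetCPUUtilizationPercentage into a v2 metrics list."""
    return [
        {
            "type": "Resource",
            "resource": {
                "name": "cpu",
                "target": {
                    "type": "Utilization",
                    "averageUtilization": int(target_cpu_utilization_percentage),
                },
            },
        }
    ]


# Example: the v1 setting targetCPUUtilizationPercentage: 75
metrics = cpu_target_to_v2_metrics(75)
print(metrics[0]["resource"]["target"])
```

Because the output carries the same CPU target as the v1 field, existing paas.yaml files would keep their current scaling behaviour.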

Dependencies

https://github.com/fiaas/k8s

Implementation history

2020-10-13: Proposal open for discussion

Oct 13 '20 08:10 apankratov

We are struggling with the exact same thing. Too bad I didn't see this issue before now. I have been trying to use the extension-hook-url to change the API version before the object is deployed, but after looking at the source code I saw that it skips this step, for obvious reasons.

Jul 06 '22 13:07 arealmaas
