Add two new options, --max-requests-inflight and --max-mutating-requests-inflight, to the control plane's default options

Open developer-guy opened this issue 6 months ago • 7 comments

What would you like to be added:

Controlling the behavior of the Kubernetes API server in an overload situation is a key task for cluster administrators, and APF (API Priority and Fairness) is one mechanism for doing so. The kube-apiserver has some controls available (i.e. the --max-requests-inflight and --max-mutating-requests-inflight command-line flags) to limit the amount of outstanding work that will be accepted, preventing a flood of inbound requests from overloading and potentially crashing the API server.

So it would be great if we could customize these parameters in Kubespray. There is also the --enable-priority-and-fairness flag, which enables or disables the APF feature. Maybe we could add another option for this.

Why is this needed:

To get more context about the feature: https://kubernetes.io/docs/concepts/cluster-administration/flow-control/

I can volunteer to do this!

developer-guy avatar Jan 04 '24 07:01 developer-guy

Note that you can use kube_kubeadm_apiserver_extra_args to pass additional configuration to kubeadm and thus to the apiserver. Is that insufficient?
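
For reference, a minimal sketch of that approach. The file path and the numeric values are illustrative assumptions, not recommendations; the upstream kube-apiserver defaults are 400 and 200 respectively. Values are quoted on the assumption that they end up as strings in the kubeadm configuration.

```yaml
# Assumed location: inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
# (hypothetical inventory name; adjust to your own layout)
kube_kubeadm_apiserver_extra_args:
  # Cap on concurrent non-mutating requests (illustrative value)
  max-requests-inflight: "800"
  # Cap on concurrent mutating requests (illustrative value)
  max-mutating-requests-inflight: "400"
  # Uncomment to turn APF off entirely; it is enabled by default
  # enable-priority-and-fairness: "false"
```

Each key is the flag name without the leading dashes, so anything the apiserver accepts on its command line can be passed the same way.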

VannTen avatar Jan 04 '24 09:01 VannTen

I agree, but I thought we could create another group of variables that can be configured together for the APF features. For example, if you enable the APF feature, we could give reasonable default values to these flags. Also, APF is enabled by default since 1.20, so for someone who might want to disable it, we could offer a simple on/off option that works without them knowing what is going on behind the scenes.

Does that make sense?

developer-guy avatar Jan 04 '24 10:01 developer-guy

if you enable the APF feature we could give a reasonable amount of default values to these flags, also APF is enabled by default since 1.20 ...

I think the defaults bundled in the apiserver are reasonable enough, don't you?

so for someone who might want to disable it, we could easily configure it by giving an on/off option without them knowing what is going on behind the scenes.

I don't see a use case where you would disable APF without having at least some knowledge of how it works. If you're trying to debug the apiserver not responding fast enough to some requests, you certainly need to know something about APF.

VannTen avatar Jan 04 '24 10:01 VannTen

I think the defaults bundled in the apiserver are reasonable enough, don't you?

Great point, I agree!

If you're trying to debug the apiserver not responding fast enough to some requests, you certainly need to know something about APF.

I agree! But we can still streamline passing these variables to kube-apiserver by eliminating the need to know the flags.

developer-guy avatar Jan 04 '24 10:01 developer-guy

I agree! But we can still streamline passing these variables to kube-apiserver by eliminating the need to know the flags.

I don't see the benefit. Knowing the flags and knowing the Kubespray variables are pretty much equivalent, and this would be one more thing we need to document, on top of growing the template.

VannTen avatar Jan 05 '24 08:01 VannTen

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Apr 04 '24 08:04 k8s-triage-robot

/lifecycle rotten

k8s-triage-robot avatar May 04 '24 09:05 k8s-triage-robot