csi-driver-host-path

define resource limits to avoid eviction

Open pohly opened this issue 5 years ago • 14 comments

Pods without resource specification are the first that get evicted when a node runs out of resources. All of our deployments should specify required resources.

Perhaps there's also something else that can be done to prevent removal of a CSI driver instance from a node?
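As a rough sketch of what "specify required resources" could look like for one container: the fragment below sets requests and limits in a pod template. The container name, image, and every number are placeholders rather than measured values. With requests lower than limits the pod is classed as Burstable; setting requests equal to limits for every container would make it Guaranteed.

```yaml
# Fragment of a pod template. Container name, image, and all values are
# placeholders for illustration, not measured recommendations.
spec:
  containers:
  - name: hostpath
    image: example.invalid/hostpathplugin:tag
    resources:
      requests:        # what the scheduler reserves on the node
        cpu: 10m
        memory: 20Mi
      limits:          # upper bound enforced at runtime
        cpu: 100m
        memory: 100Mi
```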

pohly avatar Apr 30 '19 14:04 pohly

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Jul 29 '19 15:07 fejta-bot

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot avatar Aug 28 '19 16:08 fejta-bot

/remove-lifecycle rotten

pohly avatar Aug 28 '19 17:08 pohly

In addition to resource requests, we should recommend a pod priority.

As mentioned on the mailing list, an important DaemonSet should also have a blanket toleration.

DaemonSets that are system-critical are recommended to include a blanket-toleration. Such a pod will never be evicted by a taint.

For example,
$ k get ds -n kube-system kube-proxy -oyaml
apiVersion: extensions/v1beta1
kind: DaemonSet
#snip
spec:
#snip
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
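For the pod priority part, a minimal sketch might define a PriorityClass and reference it from the pod template. The class name `csi-driver-critical` and the value are assumptions made up for this illustration; only the `scheduling.k8s.io/v1` PriorityClass kind and the `priorityClassName` field are standard Kubernetes API.

```yaml
# Hypothetical PriorityClass; name and value are illustrative only.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: csi-driver-critical
value: 1000000
globalDefault: false
description: "Priority for CSI driver pods that should outlive ordinary workloads."
---
# Fragment of a DaemonSet that opts into the class above.
spec:
  template:
    spec:
      priorityClassName: csi-driver-critical
```

Higher-priority pods are preferred by the scheduler and can preempt lower-priority ones. Priority is also one of the inputs the kubelet uses when ranking pods for node-pressure eviction.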

misterikkit avatar Oct 04 '19 16:10 misterikkit

/help

msau42 avatar Oct 04 '19 18:10 msau42

@msau42: This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Oct 04 '19 18:10 k8s-ci-robot

If we deploy it under kube-system, we can use the system-node-critical priority class directly.
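A sketch of that variant, assuming the driver runs as a DaemonSet in kube-system. The DaemonSet name, labels, container name, and image are placeholders; `system-node-critical` is one of the built-in priority classes. Whether it is accepted outside kube-system depends on the cluster version and configuration.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: csi-hostpathplugin          # placeholder name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: csi-hostpathplugin
  template:
    metadata:
      labels:
        app: csi-hostpathplugin
    spec:
      # Built-in class, so no PriorityClass object has to be created.
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists            # blanket toleration: matches every taint
      containers:
      - name: hostpath
        image: example.invalid/hostpathplugin:tag   # placeholder image
```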

cwdsuzhou avatar Jan 03 '20 03:01 cwdsuzhou

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Apr 02 '20 04:04 fejta-bot

/remove-lifecycle stale

pohly avatar Apr 02 '20 07:04 pohly

@pohly Can we define generic memory and CPU resource limits for each YAML file? Or should I implement all memory and CPU resource limits as defaults in the YAML defined in the docs?
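If "generic" here means namespace-wide defaults rather than per-manifest values, one possible shape is a LimitRange, which fills in requests and limits for containers that do not set their own. This is only one reading of the question; the object name, namespace, and numbers are placeholders.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: csi-default-resources   # placeholder name
  namespace: default            # assumed deployment namespace
spec:
  limits:
  - type: Container
    defaultRequest:             # applied when a container sets no requests
      cpu: 10m
      memory: 20Mi
    default:                    # applied when a container sets no limits
      cpu: 100m
      memory: 100Mi
```

Explicit values in each deployment file would still take precedence over these defaults.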

Kartik494 avatar May 28 '20 04:05 Kartik494

What do you mean by "generic resource limits"?

I don't know what the recommended way of determining resource limits is. It probably implies running the pods and then measuring, but I don't know how or what.
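One way the "run and measure" step could be approached, assuming the Vertical Pod Autoscaler add-on is installed in the cluster (it is not part of a default installation): a VerticalPodAutoscaler with `updateMode: "Off"` only records recommendations and never changes the pods. The target name and kind below are assumptions about how the driver is deployed.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: csi-hostpath-attacher-vpa   # placeholder name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet               # assumed workload kind
    name: csi-hostpath-attacher     # assumed workload name
  updatePolicy:
    updateMode: "Off"               # recommend only, never evict or resize pods
```

After the workload has run under representative load for a while, the recommendations appear in the object's status and can serve as a starting point for requests.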

pohly avatar May 28 '20 11:05 pohly

@pohly I would like to know whether there are any criteria for defining resources for each pod. For example, for the csi-hostpath-attacher pod, what would the CPU and memory resource limits be?

Kartik494 avatar May 29 '20 05:05 Kartik494

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Oct 15 '20 08:10 fejta-bot

/remove-lifecycle stale
/lifecycle frozen

pohly avatar Oct 15 '20 14:10 pohly