cf-for-k8s
Istio-Proxy sidecar resource requirements are excessive
Describe the bug
The resource requirements configured for the Istio-Proxy sidecar on app instances seem rather excessive, especially when compared to those of a small golang app. In our environment this has caused apps to be unschedulable due to resource constraints from the K8s scheduler/nodes.
To Reproduce
Steps to reproduce the behavior:
- cf push my-small-golang-app -m 16m
- kubectl -n cf-workloads describe pod/<app-instance-pod>
opi:
  Limits:
    ephemeral-storage: 64M
    memory: 16M
  Requests:
    cpu: 10m
    ephemeral-storage: 64M
    memory: 16M
istio-proxy:
  Limits:
    cpu: 2
    memory: 1Gi
  Requests:
    cpu: 100m
    memory: 128Mi
Expected behavior
Istio-Proxy should not have such excessive resource requests/limits set when compared to an app that itself only requests 10m CPU and 16Mi memory.
cf-for-k8s SHA
https://github.com/cloudfoundry/cf-for-k8s/tree/7c65597af7a4de935994813658a5db182fbecac9
Cluster information
PKS
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/174710760
The labels on this github issue will be updated when the story is started.
Hello @JamesClonk. Thanks for raising this issue.
We understand that you might have a small TKGI (PKS) cluster and only deploy apps with low memory usage. However, the sidecar proxy's memory usage doesn't correlate with the memory usage of the app itself, but rather with the traffic going to and from the app. So we cannot change that number based on the memory usage of the app.
Also, a quick reminder that right now cf-for-k8s has some minimum system requirements:
To deploy cf-for-k8s as is, the cluster should:
- be running Kubernetes version within range 1.16.x to 1.18.x
- have a minimum of 5 nodes
- have a minimum of 4 CPU, 15GB memory per node
You can read more about it in the deployment guide.
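If you want to double-check whether your PKS nodes meet these minimums, one quick way (assuming you have kubectl access to the cluster) is to list each node's reported capacity:
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.capacity.cpu,MEMORY:.status.capacity.memory'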
cc @kauana
Thanks @JamesClonk for submitting this issue and to @mike1808 and @kauana for your response.
Mike and Kauana, we have a couple of questions for you:
- Is this networking story related: "Platform operators can configure Istio component resource properties"?
- Would you recommend we keep this issue open for now or that we close it?
Hi @jamespollard8
- Yes, it's related. We're going to allow operators to modify Istio resource request/limits.
- Yes, let's keep this open and mark as a known issue.
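To give a rough idea of what that could look like: in stock Istio the sidecar defaults come from the global proxy settings, e.g. via an IstioOperator spec along these lines (purely a sketch of the upstream knob, with the current defaults as values; how cf-for-k8s will expose it is still to be decided):
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        resources:
          requests:
            cpu: 100m        # current default request
            memory: 128Mi
          limits:
            cpu: "2"         # current default limit
            memory: 1Gi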
I have some doubts that allowing the platform operator to configure resource requirements for sidecars globally will solve the problem, at least unless we have a foundation that is only hosting apps with very similar network traffic.
Has any conceptual work been started on how we can scale the Envoy sidecar according to the application's needs? I am aware that this will be far from trivial to solve and might even require work in Kubernetes (first-class sidecar support) or Istio (there have at least been ideas about how to decouple Envoy from the application pods). Just curious if there have been any thoughts on this in the cf-k8s-networking team.
Hello @loewenstein
We haven't personally performed any tests to validate resource requirements for sidecars, and for now we're going to rely on the numbers from the Istio documentation:
- The Envoy proxy uses 0.5 vCPU and 50 MB memory per 1000 requests per second going through the proxy.
- Istiod uses 1 vCPU and 1.5 GB of memory.
- The Envoy proxy adds 2.76 ms to the 90th percentile latency.
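To make that concrete with a back-of-the-envelope example (illustrative only, not a tested recommendation): for an app expected to peak around 2000 requests per second, the figures above would suggest sidecar values in the ballpark of:
resources:
  requests:
    cpu: "1"        # 2 x 0.5 vCPU per 1000 req/s
    memory: 100Mi   # roughly 2 x 50 MB, not counting the proxy's fixed baseline
  limits:
    cpu: "2"
    memory: 256Mi   # some headroom above the steady-state estimate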
I was just saying that "per 1000 requests" will not make for an easy platform-wide configuration.
But I do understand that we currently don't have much of an option.
@loewenstein we are going to make a doc with our recommendation (based on our testing). However, it is not prioritized right now.
The workaround would be to manually override the Envoy proxy resource requests/limits in your pod template:
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyCPULimit: "1000m"
    sidecar.istio.io/proxyMemory: "1Gi"
    sidecar.istio.io/proxyMemoryLimit: "2Gi"
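For context, these annotations have to end up on the pod metadata, i.e. under spec.template.metadata.annotations of the workload that actually runs the app instances (in cf-for-k8s that is the Eirini-managed StatefulSet in the cf-workloads namespace). A minimal sketch of the placement, with illustrative values:
spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/proxyCPU: "100m"
        sidecar.istio.io/proxyMemory: "128Mi"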
Hi @mike1808, we are in a similar situation, trying to work out the right compute resource allocation for the Envoy proxy sidecar; as the workloads increase, the cluster is being pushed into an overcommitted state. Have you published any recommendations regarding resource allocation?