
[BUG] Requests from cluster workloads to the Kubernetes API server intermittently time out or take minutes to complete

Open · arsnyder16 opened this issue 2 years ago · 100 comments

Describe the bug Requests from cluster workloads to the Kubernetes API server will intermittently time out or take minutes to complete, depending on the workload's request settings.

To Reproduce Steps to reproduce the behavior: Provision a new SLA-enabled cluster:

rg=
acr=
aks=
workspaceId=
location=northcentralus
aksVersion=1.22.6
az aks create \
  --resource-group $rg \
  --name $aks \
  --vm-set-type VirtualMachineScaleSets \
  --node-count 2 \
  --node-vm-size Standard_DS2_v2 \
  --node-osdisk-size 64 \
  --node-osdisk-type Ephemeral \
  --generate-ssh-keys \
  --kubernetes-version $aksVersion \
  --attach-acr $acr \
  --load-balancer-sku standard \
  --location $location \
  --enable-managed-identity \
  --uptime-sla \
  --enable-encryption-at-host \
  --enable-addons monitoring \
  --workspace-resource-id $workspaceId
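
Not part of the original repro steps, but for completeness: the kubectl and helm commands that follow assume the new cluster's credentials have been pulled into the local kubeconfig, e.g.

az aks get-credentials --resource-group $rg --name $aks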

This step may be optional, but it might help reproduce the problem. Install the nginx ingress controller; you can leave the replica count at 0 to avoid adding any more noise to the cluster:

helm upgrade --install nginx ingress-nginx/ingress-nginx \
  --create-namespace \
  --namespace ingress \
  --set controller.replicaCount=0 \
  --set controller.service.externalTrafficPolicy=Local \
  --wait
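
As an optional sanity check (not in the original report): with the replica count at 0, the ingress namespace should contain the controller service but no running controller pods.

kubectl get svc,pods -n ingress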

Deploy a simple workload that just uses kubectl to list the pods in a namespace; these jobs will fail once they detect the issue:

kubectl create namespace api-issue
kubectl apply -f - << EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: slowrequest
  namespace: api-issue
spec:
  parallelism: 5
  completions: 5
  backoffLimit: 2
  template:  
    metadata:
      labels:
        app: slowrequest
    spec:
      restartPolicy: Never
      containers:
      - name: slowrequest
        image: bitnami/kubectl
        imagePullPolicy: IfNotPresent   
        command: 
          - /bin/sh 
        args: 
          - -c
          - set -e; while true; do kubectl get pods -n=default --selector=app=my-api --v=9 --output=json; sleep 260s; done
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: internal-kubectl-role
  namespace: api-issue
rules:
- apiGroups: [""]
  resources: ["pods", "pods/status"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: internal-kubectl-role-binding
subjects:
- kind: ServiceAccount
  name: default
  namespace: api-issue
roleRef:
  kind: ClusterRole
  name: internal-kubectl-role
  apiGroup: rbac.authorization.k8s.io
EOF
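
A convenience step that is not part of the original report: the job pods carry the app=slowrequest label from the template above, so the failures can be watched with plain kubectl.

kubectl get pods -n api-issue -l app=slowrequest -w
kubectl logs -n api-issue -l app=slowrequest --tail=50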

With the kubectl example above, the issue manifests as a timeout during the TLS handshake. What is strange about the kubectl log output is that it does seem to have the response body, but it is shown under the response headers.

Here is an example of a successful run:

I0503 15:02:34.479797       7 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json" -H "User-Agent: kubectl/v1.23.6 (linux/amd64) kubernetes/ad33385" -H "Authorization: Bearer <masked>" 'https://10.0.0.1:443/api/v1/namespaces/default/pods?labelSelector=app%3Dmy-api&limit=500'
I0503 15:02:34.504364       7 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 24 ms Duration 24 ms
I0503 15:02:34.504383       7 round_trippers.go:577] Response Headers:
I0503 15:02:34.504390       7 round_trippers.go:580]     Date: Tue, 03 May 2022 15:02:34 GMT
I0503 15:02:34.508215       7 round_trippers.go:580]     Audit-Id: 0a8166f8-d538-4b02-8e1b-4fd050760999
I0503 15:02:34.508239       7 round_trippers.go:580]     Cache-Control: no-cache, private
I0503 15:02:34.508298       7 round_trippers.go:580]     Content-Type: application/json
I0503 15:02:34.508314       7 round_trippers.go:580]     X-Kubernetes-Pf-Flowschema-Uid: fae6b7ca-e682-4669-94eb-85163e201928
I0503 15:02:34.508319       7 round_trippers.go:580]     X-Kubernetes-Pf-Prioritylevel-Uid: 2a307f0a-f367-4f1d-ba5c-1bc4c330d0f1
I0503 15:02:34.508325       7 round_trippers.go:580]     Content-Length: 91
I0503 15:02:34.508374       7 request.go:1181] Response Body: {"kind":"PodList","apiVersion":"v1","metadata":{"resourceVersion":"251974943"},"items":[]}
{
    "apiVersion": "v1",
    "items": [],
    "kind": "List",
    "metadata": {
        "resourceVersion": "",
        "selfLink": ""
    }
}

Here is an example of a run that fails just a few minutes later:

I0503 15:06:54.560888      14 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json" -H "User-Agent: kubectl/v1.23.6 (linux/amd64) kubernetes/ad33385" -H "Authorization: Bearer <masked>" 'https://10.0.0.1:443/api/v1/namespaces/default/pods?labelSelector=app%3Dmy-api&limit=500'
I0503 15:06:54.562933      14 round_trippers.go:510] HTTP Trace: Dial to tcp:10.0.0.1:443 succeed
I0503 15:07:04.564088      14 round_trippers.go:570] HTTP Statistics: DNSLookup 0 ms Dial 1 ms TLSHandshake 10000 ms Duration 10003 ms
I0503 15:07:04.564131      14 round_trippers.go:577] Response Headers:
{
    "apiVersion": "v1",
    "items": [],
    "kind": "List",
    "metadata": {
        "resourceVersion": "",
        "selfLink": ""
    }
}
I0503 15:07:04.565727      14 helpers.go:237] Connection error: Get https://10.0.0.1:443/api/v1/namespaces/default/pods?labelSelector=app%3Dmy-api&limit=500: net/http: TLS handshake timeout
F0503 15:07:04.566119      14 helpers.go:118] Unable to connect to the server: net/http: TLS handshake timeout
goroutine 1 [running]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0x1)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1038 +0x8a
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x3080040, 0x3, 0x0, 0xc000620000, 0x2, {0x25f2ec7, 0x10}, 0xc00005c400, 0x0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:987 +0x5fd
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).printDepth(0xc0002e1a40, 0x40, 0x0, {0x0, 0x0}, 0x54, {0xc00032b470, 0x1, 0x1})
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:735 +0x1ae
k8s.io/kubernetes/vendor/k8s.io/klog/v2.FatalDepth(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1518
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.fatal({0xc0002e1a40, 0x40}, 0xc0007664e0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:96 +0xc5
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.checkErr({0x1fed760, 0xc0007664e0}, 0x1e797d0)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:191 +0x7d7
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.CheckErr(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:118
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/get.NewCmdGet.func2(0xc0003e8780, {0xc000494460, 0x1, 0x5})
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/get/get.go:181 +0xc8
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0xc0003e8780, {0xc000494410, 0x5, 0x5})
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:860 +0x5f8
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc00056cf00)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:974 +0x3bc
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/vendor/k8s.io/component-base/cli.run(0xc00056cf00)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/component-base/cli/run.go:146 +0x325
k8s.io/kubernetes/vendor/k8s.io/component-base/cli.RunNoErrOutput(...)
	/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/component-base/cli/run.go:84
main.main()
	_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubectl/kubectl.go:30 +0x1e

I experience different behaviors with different clients. For example, I have a simple Node.js app that does the same thing by listing the pods through the Kubernetes client SDK. In this environment I see situations where the request takes upwards of 5 minutes to complete:

'use strict';
const process = require('process');
const k8s = require('@kubernetes/client-node');
const kc = new k8s.KubeConfig();
kc.loadFromCluster();
const k8Core = kc.makeApiClient(k8s.CoreV1Api);
let startTime; // custom log offset to help correlate with tcp dump
const log = msg => console.log(`${(new Date() - startTime) / 1000.0} ${msg}`);
let running;
let interval;
const listPods = async () => {
  if (running) {
    // skip this tick if the previous list call is still in flight
    return;
  }
  running = true;
  log('Listing pods...');
  const listStart = new Date();
  const { body: { items } } = await k8Core.listNamespacedPod('default');
  const seconds = (new Date() - listStart) / 1000.0;
  log(`Found ${items.length} pods in ${seconds} seconds`);
  if (seconds > 60) {
    // bail out after the first excessively slow request
    log('Closing because this seems excessive');
    process.exitCode = -1;
    clearInterval(interval);
    return;
  }
  running = false;
};
setTimeout(() => {
  startTime = new Date();
  listPods();
  interval = setInterval(listPods, 215 * 1000);
}, 1000);

2022-06-28T02:04:30.490394199Z 36564.257 Listing pods...
2022-06-28T02:04:30.525501046Z 36564.292 Found 0 pods in 0.035 seconds
2022-06-28T02:08:05.550325353Z 36779.317 Listing pods...
2022-06-28T02:13:37.847998556Z 37111.614 Found 0 pods in 332.297 seconds  <--- Not great
2022-06-28T02:13:37.848028156Z 37111.614 Closing because this seems excessive
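
(Not in the original report.) If the Node.js client runs in the api-issue namespace under the default service account, the ClusterRoleBinding created above already covers its list call against the default namespace; that can be confirmed with kubectl impersonation:

kubectl auth can-i list pods -n default --as=system:serviceaccount:api-issue:default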

Expected behavior Requests should be able to complete in a reasonable amount of time. I see this across many clusters, sometimes every few minutes. To eliminate all cluster-specific variables, this is a bare-bones reproduction of the issue, so it should not suffer from user workloads affecting performance.

Environment (please complete the following information):

  • CLI Version [e.g. 3.22]
  • Kubernetes version >1.21.x

Additional context This seemed to start once we upgraded clusters from 1.20 to 1.21. I first opened a ticket with support in January, but it has since been in the support death spiral, gotten nowhere, and has yet to reach a team able to diagnose the issue or even attempt to reproduce it with the simple steps above. I have sent tcpdumps, kubelet logs, etc.
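
For reference, a sketch of one way such a capture can be gathered from a node (this is not the exact procedure from the support case; <node-name> is a placeholder for any agent node from kubectl get nodes, and nicolaka/netshoot is simply a convenient image that ships tcpdump):

kubectl debug node/<node-name> -it --image=nicolaka/netshoot
# inside the debug pod (host network), capture traffic to the in-cluster API server IP seen in the logs above
tcpdump -i any host 10.0.0.1 and port 443 -w /tmp/apiserver-timeouts.pcap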

This is not specific to any particular request; we see it across many different requests. We have various workloads that monitor the cluster using the API or dynamically create or modify workloads through the API.

I have not yet been able to reproduce this from outside the cluster; it seems to be very specific to cluster-to-control-plane communication.

This only seems to be a problem on SLA-enabled clusters. An OpenVPN / aks-link issue? I don't see any recycling of aks-link or anything else useful in the logs.

I am really curious whether Konnectivity resolves the problem, but I have yet to see it make it to any of my various clusters, which are spread across many different data centers.
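
(Not from the original report.) A quick way to see which tunnel component a given cluster is running, assuming the usual AKS pod naming for tunnelfront/aks-link and konnectivity-agent:

kubectl get pods -n kube-system -o wide | grep -E 'tunnelfront|aks-link|konnectivity'
kubectl get events -n kube-system --sort-by=.lastTimestamp | tail -n 20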

arsnyder16 · Jun 30 '22 13:06