
Metrics of type http send the whole deployment config as a query parameter

Open lyudmilalala opened this issue 9 months ago • 2 comments

Kubernetes Details (kubectl version):

K8s in Docker on Windows 10, K8s version v1.27.2

Bug & Reproduce

I tried the example that gets metrics via an HTTP request.

It works properly if everything is left unchanged. However, if I add more fields to deployment.yaml, so that it looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: as-test-deploy
  namespace: as-test-ns
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-kubernetes
  template:
    metadata:
      labels:
        app: hello-kubernetes
    spec:
      containers:
      - name: hello-kubernetes
        image: paulbouwer/hello-kubernetes:1.5
        ports:
        - containerPort: 8080
          protocol: TCP
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        resources:
          requests:
            memory: "600Mi"
          limits:
            memory: "600Mi"
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3

CPA reports the following error:

E0919 06:15:47.113456       1 main.go:277] HTTP request failed, status: [414], response: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>414 Request-URI Too Long</title>
</head><body>
<h1>Request-URI Too Long</h1>
<p>The requested URL's length exceeds the capacity
limit for this server.<br />
</p>
<hr>
<address>Apache/2.4.25 (Debian) Server at w2a.random.org Port 443</address>        
</body></html>
'

I think this is because the deployment configuration is appended to the request as a query parameter. The other error log, shown further down, supports this conjecture.
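To illustrate why a longer deployment.yaml can trigger the 414, here is a minimal Python sketch of how a JSON object grows once it is percent-encoded into a `value` query parameter. It is not CPA's actual code: `build_query_url` and the trimmed `small`/`large` payloads are hypothetical stand-ins, and only the parameter names are taken from the error log in this issue.

```python
import json
from urllib.parse import urlencode

def build_query_url(base_url: str, params: dict, resource_info: dict) -> str:
    """Append `params` plus a JSON-serialised `value` parameter to `base_url`."""
    query = dict(params)
    # Compact JSON, then percent-encode it together with the other parameters.
    query["value"] = json.dumps(resource_info, separators=(",", ":"))
    return f"{base_url}?{urlencode(query)}"

# Hypothetical, heavily trimmed stand-in for the resource information.
small = {
    "resource": {"kind": "Deployment", "metadata": {"name": "as-test-deploy"}},
    "runType": "scaler",
}

# The same object with many extra env entries, mimicking a longer deployment.yaml.
large = json.loads(json.dumps(small))
large["resource"]["spec"] = {
    "env": [
        {"name": f"VAR_{i}", "valueFrom": {"fieldRef": {"fieldPath": "metadata.name"}}}
        for i in range(200)
    ]
}

params = {"num": 1, "min": 1, "max": 5, "col": 1, "base": 10, "format": "plain", "rnd": "new"}
small_url = build_query_url("https://www.random.org/integers/", params, small)
large_url = build_query_url("https://www.random.org/integers/", params, large)

# Every quote, brace, and colon becomes a three-character escape such as %22,
# so the URL grows considerably faster than the underlying JSON.
print(len(small_url), len(large_url))
```

The URL length scales with the size of the deployment object, so any server with a cap on request-line length will eventually reject it.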

As a workaround, I have written a Python shell script and send the requests from there.
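A workaround along these lines might look like the sketch below. It is not the script actually used here; it assumes the script is run as a shell-method metric command that receives the resource information on stdin and returns the metric on stdout. The function names and the 2.5 second timeout are illustrative, while the query parameters are copied from the error log in this issue.

```python
import sys
import urllib.request
from urllib.parse import urlencode

RANDOM_ORG_URL = "https://www.random.org/integers/"
# Only the parameters the endpoint needs; no `value` parameter is added.
QUERY = {"num": 1, "min": 1, "max": 5, "col": 1, "base": 10, "format": "plain", "rnd": "new"}

def build_metric_url(base_url: str = RANDOM_ORG_URL) -> str:
    """Build the request URL without any deployment information."""
    return f"{base_url}?{urlencode(QUERY)}"

def fetch_metric(url: str, timeout: float = 2.5) -> str:
    """Perform the GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()

def main() -> int:
    # Assumption: the resource information arrives on stdin; it is read and
    # discarded because this endpoint does not need it.
    sys.stdin.read()
    try:
        sys.stdout.write(fetch_metric(build_metric_url()))
    except OSError as exc:  # URLError, HTTPError, and socket timeouts are OSErrors
        sys.stderr.write(f"metric request failed: {exc}\n")
        return 1
    return 0
```

To use it as a standalone command, end the file with `sys.exit(main())`.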

Expected behavior

The deployment.yaml used in production will probably be longer than the one I list above, so I think we should avoid putting it in the query, or at least give users the option not to send it in the query.

Additional context - The other error log

If everything is left unchanged, the following error is printed after CPA has been running for around 10 minutes. I think it is caused by the frequent requests to https://www.random.org/integers/, and I don't regard it as a client-side issue.

E0919 05:49:39.365893       1 main.go:277] Get "https://www.random.org/integers/?base=10&col=1&format=plain&max=5&min=1&num=1&rnd=new&value=%7B%22resource%22%3A%7B%22kind%22%3A%22Deployment%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22metadata%22%3A%7B%22name%22%3A%22as-test-deploy%22%2C%22namespace%22%3A%22as-test-ns%22%2C%22uid%22%3A%2231b899b6-978e-4cb6-89ec-a20d7a7a1651%22%2C%22resourceVersion%22%3A%222092030%22%2C%22generation%22%3A3%2C%22creationTimestamp%22%3A%222023-09-19T05%3A47%3A14Z%22%2C%22annotations%22%3A%7B%22deployment.kubernetes.io%2Frevision%22%3A%223%22%2C%22kubectl.kubernetes.io%2Flast-applied-configuration%22%3A%22%7B%5C%22apiVersion%5C%22%3A%5C%22apps%2Fv1%5C%22%2C%5C%22kind%5C%22%3A%5C%22Deployment%5C%22%2C%5C%22metadata%5C%22%3A%7B%5C%22annotations%5C%22%3A%7B%7D%2C%5C%22name%5C%22%3A%5C%22as-test-deploy%5C%22%2C%5C%22namespace%5C%22%3A%5C%22as-test-ns%5C%22%7D%2C%5C%22spec%5C%22%3A%7B%5C%22replicas%5C%22%3A1%2C%5C%22selector%5C%22%3A%7B%5C%22matchLabels%5C%22%3A%7B%5C%22app%5C%22%3A%5C%22as-test-app%5C%22%7D%7D%2C%5C%22template%5C%22%3A%7B%5C%22metadata%5C%22%3A%7B%5C%22annotations%5C%22%3A%7B%5C%22controller.kubernetes.io%2Fpod-deletion-cost%5C%22%3A%5C%221%5C%22%7D%2C%5C%22labels%5C%22%3A%7B%5C%22app%5C%22%3A%5C%22as-test-app%5C%22%7D%7D%2C%5C%22spec%5C%22%3A%7B%5C%22containers%5C%22%3A%5B%7B%5C%22env%5C%22%3A%5B%7B%5C%22name%5C%22%3A%5C%22POD_NAME%5C%22%2C%5C%22valueFrom%5C%22%3A%7B%5C%22fieldRef%5C%22%3A%7B%5C%22fieldPath%5C%22%3A%5C%22metadata.name%5C%22%7D%7D%7D%2C%7B%5C%22name%5C%22%3A%5C%22POD_NAMESPACE%5C%22%2C%5C%22valueFrom%5C%22%3A%7B%5C%22fieldRef%5C%22%3A%7B%5C%22fieldPath%5C%22%3A%5C%22metadata.namespace%5C%22%7D%7D%7D%5D%2C%5C%22image%5C%22%3A%5C%22lyudmilalala%2Fflask-test-img%3A1.0.0%5C%22%2C%5C%22imagePullPolicy%5C%22%3A%5C%22IfNotPresent%5C%22%2C%5C%22name%5C%22%3A%5C%22as-test-pod%5C%22%2C%5C%22ports%5C%22%3A%5B%7B%5C%22containerPort%5C%22%3A8080%7D%5D%7D%5D%2C%5C%22serviceAccountName%5C%22%3A%5C%22app-admin%5C%22%7D%7D%
7D%7D%5Cn%22%7D%2C%22managedFields%22%3A%5B%7B%22manager%22%3A%22kubectl-client-side-apply%22%2C%22operation%22%3A%22Update%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22time%22%3A%222023-09-19T05%3A49%3A19Z%22%2C%22fieldsType%22%3A%22FieldsV1%22%2C%22fieldsV1%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Akubectl.kubernetes.io%2Flast-applied-configuration%22%3A%7B%7D%7D%7D%2C%22f%3Aspec%22%3A%7B%22f%3AprogressDeadlineSeconds%22%3A%7B%7D%2C%22f%3Areplicas%22%3A%7B%7D%2C%22f%3ArevisionHistoryLimit%22%3A%7B%7D%2C%22f%3Aselector%22%3A%7B%7D%2C%22f%3Astrategy%22%3A%7B%22f%3ArollingUpdate%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AmaxSurge%22%3A%7B%7D%2C%22f%3AmaxUnavailable%22%3A%7B%7D%7D%2C%22f%3Atype%22%3A%7B%7D%7D%2C%22f%3Atemplate%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Acontroller.kubernetes.io%2Fpod-deletion-cost%22%3A%7B%7D%7D%2C%22f%3Alabels%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aapp%22%3A%7B%7D%7D%7D%2C%22f%3Aspec%22%3A%7B%22f%3Acontainers%22%3A%7B%22k%3A%7B%5C%22name%5C%22%3A%5C%22as-test-pod%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aenv%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22name%5C%22%3A%5C%22POD_NAME%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3AvalueFrom%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AfieldRef%22%3A%7B%7D%7D%7D%2C%22k%3A%7B%5C%22name%5C%22%3A%5C%22POD_NAMESPACE%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3AvalueFrom%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AfieldRef%22%3A%7B%7D%7D%7D%7D%2C%22f%3Aimage%22%3A%7B%7D%2C%22f%3AimagePullPolicy%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3Aports%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22containerPort%5C%22%3A8080%2C%5C%22protocol%5C%22%3A%5C%22TCP%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AcontainerPort%22%3A%7B%7D%2C%22f%3Aprotocol%22%3A%7B%7D%7D%7D%2C%22f%3Aresources%22%3A%7B%7D%2C%22f%3AterminationMessagePath%22%3A%7B%7D%2C%22f%3AterminationMessagePolicy%22%3A%7B%7D%7D%7D%2C%22f%3AdnsPolicy%22%3A%
7B%7D%2C%22f%3ArestartPolicy%22%3A%7B%7D%2C%22f%3AschedulerName%22%3A%7B%7D%2C%22f%3AsecurityContext%22%3A%7B%7D%2C%22f%3AserviceAccount%22%3A%7B%7D%2C%22f%3AserviceAccountName%22%3A%7B%7D%2C%22f%3AterminationGracePeriodSeconds%22%3A%7B%7D%7D%7D%7D%7D%7D%2C%7B%22manager%22%3A%22kube-controller-manager%22%2C%22operation%22%3A%22Update%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22time%22%3A%222023-09-19T05%3A49%3A20Z%22%2C%22fieldsType%22%3A%22FieldsV1%22%2C%22fieldsV1%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22f%3Adeployment.kubernetes.io%2Frevision%22%3A%7B%7D%7D%7D%2C%22f%3Astatus%22%3A%7B%22f%3AavailableReplicas%22%3A%7B%7D%2C%22f%3Aconditions%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22type%5C%22%3A%5C%22Available%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AlastTransitionTime%22%3A%7B%7D%2C%22f%3AlastUpdateTime%22%3A%7B%7D%2C%22f%3Amessage%22%3A%7B%7D%2C%22f%3Areason%22%3A%7B%7D%2C%22f%3Astatus%22%3A%7B%7D%2C%22f%3Atype%22%3A%7B%7D%7D%2C%22k%3A%7B%5C%22type%5C%22%3A%5C%22Progressing%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AlastTransitionTime%22%3A%7B%7D%2C%22f%3AlastUpdateTime%22%3A%7B%7D%2C%22f%3Amessage%22%3A%7B%7D%2C%22f%3Areason%22%3A%7B%7D%2C%22f%3Astatus%22%3A%7B%7D%2C%22f%3Atype%22%3A%7B%7D%7D%7D%2C%22f%3AobservedGeneration%22%3A%7B%7D%2C%22f%3AreadyReplicas%22%3A%7B%7D%2C%22f%3Areplicas%22%3A%7B%7D%2C%22f%3AupdatedReplicas%22%3A%7B%7D%7D%7D%7D%5D%7D%2C%22spec%22%3A%7B%22replicas%22%3A1%2C%22selector%22%3A%7B%22matchLabels%22%3A%7B%22app%22%3A%22as-test-app%22%7D%7D%2C%22template%22%3A%7B%22metadata%22%3A%7B%22creationTimestamp%22%3Anull%2C%22labels%22%3A%7B%22app%22%3A%22as-test-app%22%7D%2C%22annotations%22%3A%7B%22controller.kubernetes.io%2Fpod-deletion-cost%22%3A%221%22%7D%7D%2C%22spec%22%3A%7B%22containers%22%3A%5B%7B%22name%22%3A%22as-test-pod%22%2C%22image%22%3A%22lyudmilalala%2Fflask-test-img%3A1.0.0%22%2C%22ports%22%3A%5B%7B%22containerPort%22%3A8080%2C%22protocol%22%3A%22TCP%22%7D%5D%2C%22env%22%3A%5B%7B%22name%22%3A%22POD_NAME
%22%2C%22valueFrom%22%3A%7B%22fieldRef%22%3A%7B%22apiVersion%22%3A%22v1%22%2C%22fieldPath%22%3A%22metadata.name%22%7D%7D%7D%2C%7B%22name%22%3A%22POD_NAMESPACE%22%2C%22valueFrom%22%3A%7B%22fieldRef%22%3A%7B%22apiVersion%22%3A%22v1%22%2C%22fieldPath%22%3A%22metadata.namespace%22%7D%7D%7D%5D%2C%22resources%22%3A%7B%7D%2C%22terminationMessagePath%22%3A%22%2Fdev%2Ftermination-log%22%2C%22terminationMessagePolicy%22%3A%22File%22%2C%22imagePullPolicy%22%3A%22IfNotPresent%22%7D%2C%22dnsPolicy%22%3A%22ClusterFirst%22%2C%22serviceAccountName%22%3A%22app-admin%22%2C%22serviceAccount%22%3A%22app-admin%22%2C%22securityContext%22%3A%7B%7D%2C%22schedulerName%22%3A%22default-scheduler%22%7D%7D%2C%22strategy%22%3A%7B%22type%22%3A%22RollingUpdate%22%2C%22rollingUpdate%22%3A%7B%22maxUnavailable%22%3A%2225%25%22%2C%22maxSurge%22%3A%2225%25%22%7D%7D%2C%22revisionHistoryLimit%22%3A10%2C%22progressDplicas%22%3A1%2C%22updatedReplicas%22%3A1%2C%22readyReplicas%22%3A1%2C%22availableReplicas%22%3A1%2C%22conditions%22%3A%5B%7B%22type%22%3A%22Available%22%2C%22status%22%3A%22True%22%2C%22lastUpdateTime%22%3A%222023-09-19T05%3A47%3A20Z%22%2C%22lastTransitionTime%22%3A%222023-09-19T05%3A47%3A20Z%22%2C%22reason%22%3A%22MinimumReplicasAvailable%22%2C%22message%22%3A%22Deployment+has+minimum+availability.%22%7D%2C%7BA%222023-09-19T05%3A49%3A20Z%22%2C%22lastTransitionTime%22%3A%222023-09-19T05%3A47%3A14Z%22%2C%22reason%22%3A%22NewReplicaSetAvailable%22%2C%22message%22%3A%22ReplicaSet+%5C%22as-test-deploy-67666f6b67%5C%22+has+successfully+progressed.%22%7D%5D%7D%7D%2C%22runType%22%3A%22scaler%22%7D": context deadline exceeded

lyudmilalala avatar Sep 19 '23 09:09 lyudmilalala

Hi @lyudmilalala, thanks for raising this.

If you are hitting one of your own endpoints that needs access to this resource information you can use a POST request instead (see docs here: https://custom-pod-autoscaler.readthedocs.io/en/latest/user-guide/methods/#post-example).

I think you are right though: some endpoints will not need this resource information at all (the example that you have been running is one of them; it just needs the random number, and it shouldn't be exposing deployment information to random.org). Let me look at adding a new configuration option that skips including any resource information in hooks.
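For the POST route mentioned above, a receiving endpoint could be sketched roughly as follows, using only the Python standard library. This is an illustration and not part of CPA: it assumes the POST body carries the same JSON shape that appears URL-encoded in the error log (a top-level `resource` object plus `runType`), and the names `metric_from_body`, `MetricHandler`, and `serve` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def metric_from_body(body: bytes) -> str:
    """Extract a few fields from the posted resource information."""
    payload = json.loads(body)
    resource = payload.get("resource", {})
    return json.dumps({
        "name": resource.get("metadata", {}).get("name"),
        "replicas": resource.get("spec", {}).get("replicas", 0),
        "runType": payload.get("runType"),
    })

class MetricHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The resource information travels in the request body, so the URL
        # stays short no matter how large the deployment object is.
        length = int(self.headers.get("Content-Length", 0))
        result = metric_from_body(self.rfile.read(length)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

def serve(port: int = 8080) -> None:
    """Run the endpoint until interrupted (port 8080 is an arbitrary choice)."""
    HTTPServer(("", port), MetricHandler).serve_forever()
```

Calling `serve()` starts the endpoint. The exact body format should be checked against the POST example in the docs linked above.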

jthomperoo avatar Sep 19 '23 09:09 jthomperoo

Thanks for the quick reply, @jthomperoo.

By RESTful convention, GET is for fetching results, while POST is typically used for creating or updating data. Therefore, I think people will more often use GET to obtain metrics, so it would be better to tell them that the whole deployment config is included in the query. (It did take me a while to figure out the problem.)

Also, a quick question: if the metrics shell command exits with a non-zero status because of an error, will the evaluate script still be called, or will the evaluation step just be skipped?

I hope this project keeps improving.

lyudmilalala avatar Sep 20 '23 02:09 lyudmilalala