custom-pod-autoscaler
Metrics with type http brings the whole deployment config as query
Kubernetes Details (kubectl version):
K8s in Docker on Windows 10, K8s version v1.27.2
Bug & Reproduce
I tried the "get metrics by HTTP request" example.
It works properly if everything is left unchanged. However, if I add more fields to the deployment.yaml, so that it looks like the following:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: as-test-deploy
  namespace: as-test-ns
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-kubernetes
  template:
    metadata:
      labels:
        app: hello-kubernetes
    spec:
      containers:
      - name: hello-kubernetes
        image: paulbouwer/hello-kubernetes:1.5
        ports:
        - containerPort: 8080
          protocol: TCP
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        resources:
          requests:
            memory: "600Mi"
          limits:
            memory: "600Mi"
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3
CPA then logs the following error:
E0919 06:15:47.113456 1 main.go:277] HTTP request failed, status: [414], response: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>414 Request-URI Too Long</title>
</head><body>
<h1>Request-URI Too Long</h1>
<p>The requested URL's length exceeds the capacity
limit for this server.<br />
</p>
<hr>
<address>Apache/2.4.25 (Debian) Server at w2a.random.org Port 443</address>
</body></html>
'
I think this happens because the deployment configuration is appended to the request as a query parameter. The other error log shown further below supports this conjecture.
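To illustrate why a longer deployment can trigger the 414 response, here is a minimal Python sketch of the behaviour described above. It assumes the resource information is JSON-encoded and appended as a `value` query parameter, as the request URL in the error log suggests. The function name `build_metric_url` and the sample resources are hypothetical and are not taken from CPA's code.

```python
import json
from urllib.parse import urlencode


def build_metric_url(base_url: str, params: dict, resource: dict) -> str:
    """Append the JSON-encoded resource info as a `value` query parameter.

    This mimics the behaviour observed in the error log; it is not CPA's code.
    """
    query = dict(params)
    query["value"] = json.dumps(
        {"resource": resource, "runType": "scaler"}, separators=(",", ":")
    )
    return base_url + "?" + urlencode(query)


# Hypothetical resources: a tiny one, and one padded with many env entries.
small = {"kind": "Deployment", "metadata": {"name": "as-test-deploy"}}
large = {
    **small,
    "spec": {"env": [{"name": f"VAR_{i}", "value": "x" * 50} for i in range(100)]},
}
params = {"min": 1, "max": 5, "num": 1, "format": "plain"}

for label, resource in (("small", small), ("large", large)):
    url = build_metric_url("https://www.random.org/integers/", params, resource)
    print(label, "URL length:", len(url))
```

Percent-encoding expands every quote, brace, and colon to three characters, so the URL grows faster than the underlying JSON. Many servers cap the request line at a few kilobytes, which a real Deployment object with `managedFields` and annotations can easily exceed.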
As a workaround for now, I have written a Python shell script and send the requests from there.
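A workaround script of this kind might look like the sketch below. It is not the reporter's actual script. It assumes the script is expected to print the metric value to standard output and to signal failure through a non-zero exit status. The query parameters are copied from the request URL in the error log, minus `value`. The names `build_url`, `fetch_metric`, and `main` are hypothetical.

```python
import sys
from urllib.parse import urlencode
from urllib.request import urlopen

RANDOM_ORG_URL = "https://www.random.org/integers/"


def build_url() -> str:
    """Build the request URL without any resource information attached."""
    params = {
        "num": 1, "min": 1, "max": 5, "col": 1,
        "base": 10, "format": "plain", "rnd": "new",
    }
    return RANDOM_ORG_URL + "?" + urlencode(params)


def fetch_metric(timeout: float = 5.0) -> str:
    """Fetch one random integer as plain text."""
    with urlopen(build_url(), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def main() -> int:
    """Print the metric to stdout; return a non-zero status on failure."""
    try:
        sys.stdout.write(fetch_metric())
        return 0
    except OSError as exc:  # URLError, HTTPError and socket timeouts are OSErrors
        sys.stderr.write(f"metric request failed: {exc}\n")
        return 1
```

To run it as a standalone script, call `sys.exit(main())` at the bottom of the file. Because the URL is built inside the script, the deployment configuration never leaves the cluster.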
Expected behavior
The deployment.yaml used in production will probably be longer than the one listed above, so I think we should avoid putting it in the query string, or at least give users the option not to send it that way.
Additional context - The other error log
If everything is left unchanged, the following error is printed after CPA has been running for around 10 minutes. I think it is caused by sending requests to https://www.random.org/integers/ too frequently, so I do not regard it as a client-side issue.
E0919 05:49:39.365893 1 main.go:277] Get "https://www.random.org/integers/?base=10&col=1&format=plain&max=5&min=1&num=1&rnd=new&value=%7B%22resource%22%3A%7B%22kind%22%3A%22Deployment%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22metadata%22%3A%7B%22name%22%3A%22as-test-deploy%22%2C%22namespace%22%3A%22as-test-ns%22%2C%22uid%22%3A%2231b899b6-978e-4cb6-89ec-a20d7a7a1651%22%2C%22resourceVersion%22%3A%222092030%22%2C%22generation%22%3A3%2C%22creationTimestamp%22%3A%222023-09-19T05%3A47%3A14Z%22%2C%22annotations%22%3A%7B%22deployment.kubernetes.io%2Frevision%22%3A%223%22%2C%22kubectl.kubernetes.io%2Flast-applied-configuration%22%3A%22%7B%5C%22apiVersion%5C%22%3A%5C%22apps%2Fv1%5C%22%2C%5C%22kind%5C%22%3A%5C%22Deployment%5C%22%2C%5C%22metadata%5C%22%3A%7B%5C%22annotations%5C%22%3A%7B%7D%2C%5C%22name%5C%22%3A%5C%22as-test-deploy%5C%22%2C%5C%22namespace%5C%22%3A%5C%22as-test-ns%5C%22%7D%2C%5C%22spec%5C%22%3A%7B%5C%22replicas%5C%22%3A1%2C%5C%22selector%5C%22%3A%7B%5C%22matchLabels%5C%22%3A%7B%5C%22app%5C%22%3A%5C%22as-test-app%5C%22%7D%7D%2C%5C%22template%5C%22%3A%7B%5C%22metadata%5C%22%3A%7B%5C%22annotations%5C%22%3A%7B%5C%22controller.kubernetes.io%2Fpod-deletion-cost%5C%22%3A%5C%221%5C%22%7D%2C%5C%22labels%5C%22%3A%7B%5C%22app%5C%22%3A%5C%22as-test-app%5C%22%7D%7D%2C%5C%22spec%5C%22%3A%7B%5C%22containers%5C%22%3A%5B%7B%5C%22env%5C%22%3A%5B%7B%5C%22name%5C%22%3A%5C%22POD_NAME%5C%22%2C%5C%22valueFrom%5C%22%3A%7B%5C%22fieldRef%5C%22%3A%7B%5C%22fieldPath%5C%22%3A%5C%22metadata.name%5C%22%7D%7D%7D%2C%7B%5C%22name%5C%22%3A%5C%22POD_NAMESPACE%5C%22%2C%5C%22valueFrom%5C%22%3A%7B%5C%22fieldRef%5C%22%3A%7B%5C%22fieldPath%5C%22%3A%5C%22metadata.namespace%5C%22%7D%7D%7D%5D%2C%5C%22image%5C%22%3A%5C%22lyudmilalala%2Fflask-test-img%3A1.0.0%5C%22%2C%5C%22imagePullPolicy%5C%22%3A%5C%22IfNotPresent%5C%22%2C%5C%22name%5C%22%3A%5C%22as-test-pod%5C%22%2C%5C%22ports%5C%22%3A%5B%7B%5C%22containerPort%5C%22%3A8080%7D%5D%7D%5D%2C%5C%22serviceAccountName%5C%22%3A%5C%22app-admin%5C%22%7D%7D%7D%7D%
5Cn%22%7D%2C%22managedFields%22%3A%5B%7B%22manager%22%3A%22kubectl-client-side-apply%22%2C%22operation%22%3A%22Update%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22time%22%3A%222023-09-19T05%3A49%3A19Z%22%2C%22fieldsType%22%3A%22FieldsV1%22%2C%22fieldsV1%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Akubectl.kubernetes.io%2Flast-applied-configuration%22%3A%7B%7D%7D%7D%2C%22f%3Aspec%22%3A%7B%22f%3AprogressDeadlineSeconds%22%3A%7B%7D%2C%22f%3Areplicas%22%3A%7B%7D%2C%22f%3ArevisionHistoryLimit%22%3A%7B%7D%2C%22f%3Aselector%22%3A%7B%7D%2C%22f%3Astrategy%22%3A%7B%22f%3ArollingUpdate%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AmaxSurge%22%3A%7B%7D%2C%22f%3AmaxUnavailable%22%3A%7B%7D%7D%2C%22f%3Atype%22%3A%7B%7D%7D%2C%22f%3Atemplate%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Acontroller.kubernetes.io%2Fpod-deletion-cost%22%3A%7B%7D%7D%2C%22f%3Alabels%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aapp%22%3A%7B%7D%7D%7D%2C%22f%3Aspec%22%3A%7B%22f%3Acontainers%22%3A%7B%22k%3A%7B%5C%22name%5C%22%3A%5C%22as-test-pod%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aenv%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22name%5C%22%3A%5C%22POD_NAME%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3AvalueFrom%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AfieldRef%22%3A%7B%7D%7D%7D%2C%22k%3A%7B%5C%22name%5C%22%3A%5C%22POD_NAMESPACE%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3AvalueFrom%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AfieldRef%22%3A%7B%7D%7D%7D%7D%2C%22f%3Aimage%22%3A%7B%7D%2C%22f%3AimagePullPolicy%22%3A%7B%7D%2C%22f%3Aname%22%3A%7B%7D%2C%22f%3Aports%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22containerPort%5C%22%3A8080%2C%5C%22protocol%5C%22%3A%5C%22TCP%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AcontainerPort%22%3A%7B%7D%2C%22f%3Aprotocol%22%3A%7B%7D%7D%7D%2C%22f%3Aresources%22%3A%7B%7D%2C%22f%3AterminationMessagePath%22%3A%7B%7D%2C%22f%3AterminationMessagePolicy%22%3A%7B%7D%7D%7D%2C%22f%3AdnsPolicy%22%3A%7B%7D%
2C%22f%3ArestartPolicy%22%3A%7B%7D%2C%22f%3AschedulerName%22%3A%7B%7D%2C%22f%3AsecurityContext%22%3A%7B%7D%2C%22f%3AserviceAccount%22%3A%7B%7D%2C%22f%3AserviceAccountName%22%3A%7B%7D%2C%22f%3AterminationGracePeriodSeconds%22%3A%7B%7D%7D%7D%7D%7D%7D%2C%7B%22manager%22%3A%22kube-controller-manager%22%2C%22operation%22%3A%22Update%22%2C%22apiVersion%22%3A%22apps%2Fv1%22%2C%22time%22%3A%222023-09-19T05%3A49%3A20Z%22%2C%22fieldsType%22%3A%22FieldsV1%22%2C%22fieldsV1%22%3A%7B%22f%3Ametadata%22%3A%7B%22f%3Aannotations%22%3A%7B%22f%3Adeployment.kubernetes.io%2Frevision%22%3A%7B%7D%7D%7D%2C%22f%3Astatus%22%3A%7B%22f%3AavailableReplicas%22%3A%7B%7D%2C%22f%3Aconditions%22%3A%7B%22.%22%3A%7B%7D%2C%22k%3A%7B%5C%22type%5C%22%3A%5C%22Available%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AlastTransitionTime%22%3A%7B%7D%2C%22f%3AlastUpdateTime%22%3A%7B%7D%2C%22f%3Amessage%22%3A%7B%7D%2C%22f%3Areason%22%3A%7B%7D%2C%22f%3Astatus%22%3A%7B%7D%2C%22f%3Atype%22%3A%7B%7D%7D%2C%22k%3A%7B%5C%22type%5C%22%3A%5C%22Progressing%5C%22%7D%22%3A%7B%22.%22%3A%7B%7D%2C%22f%3AlastTransitionTime%22%3A%7B%7D%2C%22f%3AlastUpdateTime%22%3A%7B%7D%2C%22f%3Amessage%22%3A%7B%7D%2C%22f%3Areason%22%3A%7B%7D%2C%22f%3Astatus%22%3A%7B%7D%2C%22f%3Atype%22%3A%7B%7D%7D%7D%2C%22f%3AobservedGeneration%22%3A%7B%7D%2C%22f%3AreadyReplicas%22%3A%7B%7D%2C%22f%3Areplicas%22%3A%7B%7D%2C%22f%3AupdatedReplicas%22%3A%7B%7D%7D%7D%7D%5D%7D%2C%22spec%22%3A%7B%22replicas%22%3A1%2C%22selector%22%3A%7B%22matchLabels%22%3A%7B%22app%22%3A%22as-test-app%22%7D%7D%2C%22template%22%3A%7B%22metadata%22%3A%7B%22creationTimestamp%22%3Anull%2C%22labels%22%3A%7B%22app%22%3A%22as-test-app%22%7D%2C%22annotations%22%3A%7B%22controller.kubernetes.io%2Fpod-deletion-cost%22%3A%221%22%7D%7D%2C%22spec%22%3A%7B%22containers%22%3A%5B%7B%22name%22%3A%22as-test-pod%22%2C%22image%22%3A%22lyudmilalala%2Fflask-test-img%3A1.0.0%22%2C%22ports%22%3A%5B%7B%22containerPort%22%3A8080%2C%22protocol%22%3A%22TCP%22%7D%5D%2C%22env%22%3A%5B%7B%22name%22%3A%22POD_NAME%22%2C
%22valueFrom%22%3A%7B%22fieldRef%22%3A%7B%22apiVersion%22%3A%22v1%22%2C%22fieldPath%22%3A%22metadata.name%22%7D%7D%7D%2C%7B%22name%22%3A%22POD_NAMESPACE%22%2C%22valueFrom%22%3A%7B%22fieldRef%22%3A%7B%22apiVersion%22%3A%22v1%22%2C%22fieldPath%22%3A%22metadata.namespace%22%7D%7D%7D%5D%2C%22resources%22%3A%7B%7D%2C%22terminationMessagePath%22%3A%22%2Fdev%2Ftermination-log%22%2C%22terminationMessagePolicy%22%3A%22File%22%2C%22imagePullPolicy%22%3A%22IfNotPresent%22%7D%2C%22dnsPolicy%22%3A%22ClusterFirst%22%2C%22serviceAccountName%22%3A%22app-admin%22%2C%22serviceAccount%22%3A%22app-admin%22%2C%22securityContext%22%3A%7B%7D%2C%22schedulerName%22%3A%22default-scheduler%22%7D%7D%2C%22strategy%22%3A%7B%22type%22%3A%22RollingUpdate%22%2C%22rollingUpdate%22%3A%7B%22maxUnavailable%22%3A%2225%25%22%2C%22maxSurge%22%3A%2225%25%22%7D%7D%2C%22revisionHistoryLimit%22%3A10%2C%22progressDplicas%22%3A1%2C%22updatedReplicas%22%3A1%2C%22readyReplicas%22%3A1%2C%22availableReplicas%22%3A1%2C%22conditions%22%3A%5B%7B%22type%22%3A%22Available%22%2C%22status%22%3A%22True%22%2C%22lastUpdateTime%22%3A%222023-09-19T05%3A47%3A20Z%22%2C%22lastTransitionTime%22%3A%222023-09-19T05%3A47%3A20Z%22%2C%22reason%22%3A%22MinimumReplicasAvailable%22%2C%22message%22%3A%22Deployment+has+minimum+availability.%22%7D%2C%7BA%222023-09-19T05%3A49%3A20Z%22%2C%22lastTransitionTime%22%3A%222023-09-19T05%3A47%3A14Z%22%2C%22reason%22%3A%22NewReplicaSetAvailable%22%2C%22message%22%3A%22ReplicaSet+%5C%22as-test-deploy-67666f6b67%5C%22+has+successfully+progressed.%22%7D%5D%7D%7D%2C%22runType%22%3A%22scaler%22%7D": context deadline exceeded
Hi @lyudmilalala, thanks for raising this.
If you are hitting one of your own endpoints that needs access to this resource information you can use a POST request instead (see docs here: https://custom-pod-autoscaler.readthedocs.io/en/latest/user-guide/methods/#post-example).
I think you are right though, some endpoints will not need this resource information at all (the example that you have been running is one of them, it just needs the random number, and it shouldn't be exposing deployment information to random.org). Let me look at adding in a new configuration option that skips including any resource information in hooks.
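For an endpoint of your own that does need the resource information, the POST approach means reading it from the request body instead of the query string. Below is a minimal sketch using only the Python standard library. It assumes the body is the same JSON object that appears URL-encoded in the log above, with top-level `resource` and `runType` keys. The handler, the port, and the choice of `spec.replicas` as the metric are hypothetical and purely for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def metric_from_body(body: bytes) -> str:
    """Extract an illustrative metric (spec.replicas) from the JSON body."""
    payload = json.loads(body)
    resource = payload.get("resource", {})
    return str(resource.get("spec", {}).get("replicas", 0))


class MetricHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes; the body has no URL length limit.
        length = int(self.headers.get("Content-Length", 0))
        try:
            result = metric_from_body(self.rfile.read(length))
        except (ValueError, AttributeError):
            self.send_error(400, "invalid JSON body")
            return
        data = result.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Serve until interrupted; the port is an arbitrary choice."""
    HTTPServer(("", port), MetricHandler).serve_forever()
```

Calling `serve()` starts the endpoint. Keeping the parsing in `metric_from_body` separates it from the HTTP plumbing, so it can be tested without starting a server.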
Thanks for the quick reply, @jthomperoo.
By RESTful convention, GET is for fetching results, while POST is used for creating or updating data. I therefore think people will more often use GET to obtain metrics, so it would be better to tell them that the whole deployment config is included in the query. (It did take me a while to figure out the problem.)
Also, a quick question: if the metrics shell script exits with a non-zero status because of an error, will the evaluate script still be called, or will the evaluation step just be skipped?
Hope this project becomes better.