postgres-operator
Metadata annotation "history" in postgres's endpoint is too long
Please, answer some short questions which should help us to understand your problem / question better?
- Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.0
- Where do you run it - cloud or metal? Kubernetes or OpenShift? [AWS K8s | GCP ... | Bare Metal K8s] Bare Metal K8S
- Are you running Postgres Operator in production? yes
- Type of issue? [Bug report, question, feature request, etc.]
2022-07-14 20:15:56,317 ERROR: Unexpected error from Kubernetes API
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 483, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 877, in patch_or_create
return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 868, in _patch_or_create
ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 468, in wrapper
return getattr(self._core_v1_api, func)(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 404, in wrapper
return self._api_client.call_api(method, path, headers, body, **kwargs)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 373, in call_api
return self._handle_server_response(response, _preload_content)
File "/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py", line 203, in _handle_server_response
raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Audit-Id': '36d5bda8-8944-497c-9559-f101567e2bcf', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '02dcfdea-2f55-4b2a-a6c2-99f41f1ab800', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'fe9a36c3-4d6a-41b7-97b0-3f6346defa3e', 'Date': 'Thu, 14 Jul 2022 20:15:56 GMT', 'Content-Length': '761'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Endpoints \\"postgres-postgres-config\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes","reason":"Invalid","details":{"name":"postgres-postgres-config","kind":"Endpoints","causes":[{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"}]},"code":422}\n'
k -n <> get ep postgres-postgres-config
apiVersion: v1
kind: Endpoints
metadata:
annotations:
config: '{"loop_wait":10,"maximum_lag_on_failover":33554432,"postgresql":{"parameters":{"archive_mode":"on","archive_timeout":"1800s","autovacuum_analyze_scale_factor":0.02,"autovacuum_max_workers":5,"au>
[%p]: [%l-1] %c %x %d %u %a %h ","log_lock_waits":"on","log_min_duration_statement":500,"log_statement":"ddl","log_temp_files":0,"max_connections":"100","max_replication_slots":10,"max_wal_senders":10,>
all all trust","host all all 127.0.0.1/32 md5","host all all ::1/128 md5","hostssl
replication standby all md5","hostssl all +zalandos all pam","hostssl all all
all md5","hostnossl all all all md5"]}'
history: '[[1,100663456,"no recovery target specified","2021-02-11T19:54:09+00:00"],[2,117440672,"no
recovery target specified"],[3,2030061304,"no recovery target specified","2021-02-20T17:17:52+00:00"],[4,8455716864,"no
recovery target specified","2021-02-28T18:09:33+00:00"],[5,22498246816,"no recovery
target specified","2021-03-18T04:29:35+00:00"],[6,22515024032,"no recovery target
specified","2021-03-18T04:31:26+00:00"],[7,22531801248,"no recovery target specified","2021-03-18T04:55:43+00:00"],[8,27095204000,"no
recovery target specified","2021-03-23T20:47:12+00:00"],[9,27128758432,"no recovery
target specified","2021-03-23T21:46:39+00:00"],[10,29544677536,"no recovery
target specified","2021-03-26T21:36:58+00:00"],[11,29561454752,"no recovery
target specified","2021-03-26T21:38:40+00:00"],[12,29578231968,"no recovery
target specified","2021-03-26T21:54:36+00:00"],[13,29578231968,"no recovery
target specified","2021-03-26T21:55:18+00:00"],[14,29611786400,"no recovery
target specified"],[15,29628563616,"no recovery target specified","2021-03-26T22:17:30+00:00"],[16,29645340832,"no
recovery target specified","2021-03-26T22:28:14+00:00"],[17,43504541880,"no
recovery target specified","2021-04-13T03:56:18+00:00"],[18,43509265984,"no
recovery target specified","2021-04-13T04:28:02+00:00"],[19,43512271984,"no
recovery target specified","2021-04-13T04:37:40+00:00"],[20,43514796584,"no
recovery target specified","2021-04-13T04:50:35+00:00"],[21,43520405216,"no
recovery target specified","2021-04-13T05:30:00+00:00"],[22,43522166616,"no
recovery target specified","2021-04-13T05:44:34+00:00"],[23,43525341240,"no
recovery target specified","2021-04-13T06:02:48+00:00"],[24,43528604672,"no
recovery target specified","2021-04-13T06:09:21+00:00"],[25,43536875680,"no
recovery target specified","2021-04-13T06:18:51+00:00"],[26,50717524128,"no
recovery target specified","2021-04-22T03:51:55+00:00"],[27,50734301344,"no
recovery target specified","2021-04-22T04:05:34+00:00"],[28,50818187424,"no
recovery target specified","2021-04-22T06:13:53+00:00"],[29,54626615456,"no
recovery target specified","2021-04-26T23:19:57+00:00"],[30,73316434080,"no
recovery target specified","2021-05-20T04:01:45+00:00"],[31,73317607192,"no
recovery target specified","2021-05-20T04:14:58+00:00"],[32,101502156960,"no
recovery target specified","2021-06-24T03:40:38+00:00"],[33,101518934176,"no
...
Oh, it seems that you have had a lot of failovers... You can reduce the number of history lines stored in the annotation by using the max_timelines_history parameter.
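To illustrate why capping the history helps, here is a minimal sketch in Python. It is not Patroni's actual code: trim_history is a hypothetical helper, the history entries are synthetic, and treating 0 as "keep everything" is an assumption made for this example. It only shows that keeping the last N timeline entries brings the serialized value far below the 262144-byte limit from the error above.

```python
import json

# Total annotation size limit quoted in the error message above.
ANNOTATION_LIMIT_BYTES = 262144


def trim_history(history, max_timelines_history):
    """Keep only the most recent entries (hypothetical helper).

    A value of 0 or less is treated here as "keep the full history".
    """
    if max_timelines_history <= 0:
        return list(history)
    return list(history[-max_timelines_history:])


# Synthetic entries shaped like the annotation: [timeline, LSN, reason, timestamp].
history = [
    [n, 100663456 + n * 16777216, "no recovery target specified",
     "2021-02-11T19:54:09+00:00"]
    for n in range(1, 5001)
]

# Serialize compactly, as in the annotation shown in the issue.
full = json.dumps(history, separators=(",", ":"))
trimmed = json.dumps(trim_history(history, 10), separators=(",", ":"))

for label, value in (("full", full), ("trimmed", trimmed)):
    size = len(value.encode("utf-8"))
    print(label, size, "too long" if size > ANNOTATION_LIMIT_BYTES else "ok")
```

With several thousand synthetic timelines the full value exceeds the limit, while the last 10 entries take well under a kilobyte.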
I have the same issue. I tried setting max_timelines_history: 10 via patronictl edit-config, restarted all database pods, restarted the postgres-operator pod, and deleted the acid-prod-api config endpoint.
I also tried adding it to the database YAML config:
postgresql:
  parameters:
    max_timelines_history: "10"
but I am still getting this error:
ERROR: Unexpected error from Kubernetes API
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 483, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 877, in patch_or_create
return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 868, in _patch_or_create
ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 468, in wrapper
return getattr(self._core_v1_api, func)(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 404, in wrapper
return self._api_client.call_api(method, path, headers, body, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 373, in call_api
return self._handle_server_response(response, _preload_content)
File "/usr/local/lib/python3.6/dist-packages/patroni/dcs/kubernetes.py", line 203, in _handle_server_response
raise k8s_client.rest.ApiException(http_resp=response)
patroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)
Reason: Unprocessable Entity
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'c8949d3d-9984-421a-ad33-3a62c453fd6c', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': 'ca267a38-8ca9-4f84-8fd7-25684e895f05', 'X-Kubernetes-Pf-Prioritylevel-Uid': '80ccb9a2-95f6-47fb-bb00-dd34f2f81d54', 'Date': 'Mon, 18 Jul 2022 09:55:39 GMT', 'Content-Length': '753'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Endpoints \\"acid-prod-api-config\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes","reason":"Invalid","details":{"name":"acid-prod-api-config","kind":"Endpoints","causes":[{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"},{"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"}]},"code":422}\n'
It looks like it is cached somewhere and nothing helps. Do you have any idea where I can clear it?
Hi! I fixed it manually: kubectl exec into the pod and run "patronictl edit-config". "maximum_lag_on_failover" belongs to the Patroni layer (config), not to Postgres. The list of Patroni parameters that can be set in the manifest is limited (see patroni-parameters). Good luck!
patronictl edit-config
thank you very much. it helped
[SOLVED]
Hi guys thanks for the help!
We are also running the postgres operator and we had the same exception being thrown.
We followed the steps you provided:
- manually shortened the endpoint metadata via kubectl -n tefde-bmi-ci-infra edit ep postgres-hive-metastore-config -o yaml (this enabled changing the config via patronictl)
- changed the config via patronictl (this was initially not possible because we were getting the same metadata-too-long exception), adding max_timelines_history: 10
But we are still getting the same error: metadata.annotations: Too long: must have at most 262144 bytes, even though the file is now quite small, less than 1000 bytes. We tried uninstalling the operator and installing it again, but that did not solve the issue.
We seem to have the same "cached somewhere" problem as @wasap. We read the last comment from @vbortnikov, but we could not understand whether we had to set maximum_lag_on_failover and, if so, to what value.
Any other suggestions?
We were putting the max_timelines_history parameter in the wrong place. It goes in the outer (top-level) part of the config:
max_timelines_history: 10
maximum_lag_on_failover: 33554432
postgresql:
  parameters:
    archive_mode: 'on'
    archive_timeout: 1800s
    autovacuum_analyze_scale_factor: 0.02
We were wondering where the config is cached, because modifying it manually in k8s was not enough.
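For anyone hitting the same placement mistake, here is a small Python sketch of the fix on a config represented as a plain dict. place_history_limit is a hypothetical helper written for this comment, not part of Patroni or the operator; it only moves the key from postgresql.parameters to the top level, where the working config above has it.

```python
def place_history_limit(config, limit=10):
    """Return a copy of config with max_timelines_history at the top level.

    Hypothetical helper: if the key was misplaced under
    postgresql.parameters, it is removed there and its value is reused.
    """
    fixed = dict(config)
    misplaced = None
    if "postgresql" in fixed:
        # Copy the nested dicts so the caller's config is left untouched.
        postgresql = dict(fixed["postgresql"])
        parameters = dict(postgresql.get("parameters", {}))
        misplaced = parameters.pop("max_timelines_history", None)
        postgresql["parameters"] = parameters
        fixed["postgresql"] = postgresql
    if "max_timelines_history" not in fixed:
        fixed["max_timelines_history"] = int(misplaced) if misplaced is not None else limit
    return fixed


# The misplaced variant: the key sits among the Postgres parameters.
wrong = {
    "maximum_lag_on_failover": 33554432,
    "postgresql": {
        "parameters": {
            "archive_mode": "on",
            "max_timelines_history": "10",
        }
    },
}

fixed = place_history_limit(wrong)
print(fixed)
```

The result has max_timelines_history next to maximum_lag_on_failover and no longer lists it under postgresql.parameters.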
Happy coding
Connect to each pod with kubectl exec ...
Run patronictl edit-config and add max_timelines_history: 10 there.
We are facing the same issue in our Kubernetes deployment.
Could the max_timelines_history parameter be added to the postgres manifest? Currently we can set only a limited set of Patroni parameters there. Having max_timelines_history set to a specific value, or having a finite default, would prevent issues with oversized annotations: {"reason":"FieldValueTooLong","message":"Too long: must have at most 262144 bytes","field":"metadata.annotations"}
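As a way to spot the problem before the API server rejects the update, here is a rough Python sketch of a size check. It assumes the limit applies to the combined byte length of all annotation keys and values on the object (my understanding of how Kubernetes validates annotations); annotations_size and largest_annotations are hypothetical helper names, and the sample values are made up.

```python
# 256 KiB, the 262144 bytes quoted in the error message.
ANNOTATION_LIMIT_BYTES = 256 * 1024


def annotations_size(annotations):
    """Sum the UTF-8 byte lengths of all keys and values."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in annotations.items()
    )


def largest_annotations(annotations):
    """Return (key, value size in bytes) pairs, largest first."""
    sizes = [(key, len(value.encode("utf-8"))) for key, value in annotations.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


# Made-up annotations with the same keys as the endpoint in the issue.
annotations = {
    "config": "x" * 1000,
    "history": "y" * 300000,
}

total = annotations_size(annotations)
print(total, "over limit" if total > ANNOTATION_LIMIT_BYTES else "within limit")
print(largest_annotations(annotations)[0])
```

Here the history annotation dominates the total, which matches what the endpoint dump at the top of this issue suggests.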