
GKE errors on .managedAlertmanager field when adding labels/annotations to OperatorConfig

Open bjakubski opened this issue 2 years ago • 4 comments

I'm using Terraform (TF) to add labels/annotations to the OperatorConfig so that it can be managed by Helm. I tried the kubernetes_labels/kubernetes_annotations resources from the Terraform kubernetes provider and, to my surprise, got errors:

Error: .managedAlertmanager: field not declared in schema
with module.REDACTED.kubernetes_labels.operatorconfig-helm[0]
on .terraform/modules/REDACTED  in resource "kubernetes_labels" "operatorconfig-helm":

resource "kubernetes_labels" "operatorconfig-helm" {

I started filing an issue with the k8s provider, but looking at the debug logs I can't actually see the TF provider doing anything wrong. Here's an excerpt:

2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: ---[ REQUEST ]---------------------------------------
2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: PATCH /apis/monitoring.googleapis.com/v1/namespaces/gmp-public/operatorconfigs/config?fieldManager=Terraform&force=false HTTP/1.1
2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Host: REDACTED
2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: User-Agent: HashiCorp/1.0 Terraform/1.3.6
2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Content-Length: 240
2022-12-12T13:53:50.816Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Accept: application/json
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Authorization: Bearer REDACTED...
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Content-Type: application/apply-patch+yaml
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Accept-Encoding: gzip
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: 
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: {
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "apiVersion": "monitoring.googleapis.com/v1",
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "kind": "OperatorConfig",
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "metadata": {
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:   "annotations": {
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:    "meta.helm.sh/release-name": "managedprometheus-patch",
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:    "meta.helm.sh/release-namespace": "gmp-public"
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:   },
2022-12-12T13:53:50.817Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:   "name": "config",
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:   "namespace": "gmp-public"
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  }
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: }
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: 
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: -----------------------------------------------------
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: 2022/12/12 13:53:50 [DEBUG] Kubernetes API Response Details:
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: ---[ RESPONSE ]--------------------------------------
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: HTTP/2.0 500 Internal Server Error
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Content-Length: 143
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Audit-Id: f0ae4be9-a8e8-4364-87bb-3407ee014fb4
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Cache-Control: no-cache, private
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Content-Type: application/json
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: Date: Mon, 12 Dec 2022 13:53:50 GMT
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: X-Kubernetes-Pf-Flowschema-Uid: 356b0161-cd59-4f1f-9170-86f13609e682
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: X-Kubernetes-Pf-Prioritylevel-Uid: 6a8d0aab-cd6e-42a6-9f89-fcde7204b463
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: 
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: {
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "kind": "Status",
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "apiVersion": "v1",
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "metadata": {},
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "status": "Failure",
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "message": ".managedAlertmanager: field not declared in schema",
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5:  "code": 500
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: }
2022-12-12T13:53:50.818Z [DEBUG] provider.terraform-provider-kubernetes_v2.16.1_x5: 

I'm not that familiar with how the k8s API works, so I don't know exactly what to make of it. What I do know:

  1. The request does not include the managedAlertmanager field
  2. We never set managedAlertmanager ourselves (we're using our own Alertmanager)
  3. If I manually delete the whole .managedAlertmanager field from the OperatorConfig on the cluster, then the TF provider's request completes successfully

Let me know if this is something that can be improved on the monitoring side, or whether it looks like a k8s/GKE/TF issue to you.

bjakubski avatar Dec 12 '22 15:12 bjakubski

Hi @bjakubski,

Your cluster may not have the latest release, which includes the .managedAlertmanager field.

What does this command return?

kubectl get deploy gmp-operator -ngmp-system -ojsonpath="{.spec.template.metadata.annotations['components.gke.io/component-version']}"

If it's < 0.3.1, then your OperatorConfig is using the older spec.
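The "< 0.3.1" check above can be scripted. This is a minimal sketch, not part of the original reply: `version_lt` and `check_gmp_operator_version` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (GNU coreutils, recent BSD/macOS) and a plain dotted version string in the annotation.

```shell
# version_lt A B: succeeds when version A sorts strictly before version B.
# Hypothetical helper; relies on `sort -V` for version-aware ordering.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Reads the component version from the gmp-operator Deployment (same query as
# the command above) and reports whether it predates 0.3.1.
check_gmp_operator_version() {
  version="$(kubectl get deploy gmp-operator -n gmp-system \
    -o jsonpath="{.spec.template.metadata.annotations['components.gke.io/component-version']}")"
  if version_lt "$version" "0.3.1"; then
    echo "gmp-operator $version: OperatorConfig uses the older spec"
  else
    echo "gmp-operator $version: spec should include managedAlertmanager"
  fi
}
```

Call `check_gmp_operator_version` with your kubeconfig pointing at the cluster in question; nothing runs against the cluster until the function is invoked.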

pintohutch avatar Dec 15 '22 01:12 pintohutch

I'm on 0.3.1, and the .managedAlertmanager field is present and set in the OperatorConfig (although we didn't set it):

kubectl get -ngmp-public operatorconfigs.monitoring.googleapis.com config -ojsonpath='{.managedAlertmanager}'

returns

{"configSecret":{"key":"alertmanager.yaml","name":"alertmanager"}}

For the API request shown in the debug output to succeed, I first have to remove .managedAlertmanager from the OperatorConfig.

bjakubski avatar Dec 15 '22 08:12 bjakubski

FWIW this problem was only present in a couple of the GKE clusters where we were making changes (out of ~15). In the clusters where it happened, this workaround:

kubectl edit operatorconfig

remove managedAlertmanager completely and save

solved the issue. (managedAlertmanager was immediately reconciled back, but it didn't interfere with patching after that.)
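For anyone who would rather not use the interactive editor, the same removal can probably be done with a JSON Patch. This is a sketch under assumptions, not something verified on an affected cluster: the function and variable names are hypothetical, and the resource name (config) and namespace (gmp-public) are taken from the debug log earlier in this thread.

```shell
# JSON Patch (RFC 6902) that drops the top-level managedAlertmanager field.
REMOVE_PATCH='[{"op":"remove","path":"/managedAlertmanager"}]'

# Hypothetical wrapper: applies the patch to the OperatorConfig named "config"
# in gmp-public. The operator is expected to restore the default afterwards.
remove_managed_alertmanager() {
  kubectl patch operatorconfigs.monitoring.googleapis.com config \
    -n gmp-public --type=json -p "$REMOVE_PATCH"
}
```

Run `remove_managed_alertmanager` once per affected cluster, then retry the Terraform apply.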

bjakubski avatar Dec 22 '22 10:12 bjakubski

Hi @bjakubski,

Apologies for the delayed response.

So this .managedAlertmanager is actually set through the default value of {"configSecret":{"key":"alertmanager.yaml","name":"alertmanager"}} in our manifests (generated from a kubebuilder marker). This lets the rule-evaluator talk to the managed alertmanager that ships with managed-collection by default, with no extra work required.

If you change your TF resource to include the default value for .managedAlertmanager to match, I wonder if the terraform apply will work.
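One way to try this idea without touching the Terraform code is to send a comparable server-side apply with kubectl. The sketch below is an untested assumption, not a confirmed fix: the helper names are hypothetical, the annotations and field manager are copied from the debug log above, and applying it would make that field manager an owner of .managedAlertmanager.

```shell
# Prints an apply configuration that carries the Helm annotations plus the
# default managedAlertmanager value quoted above.
operatorconfig_manifest() {
  cat <<'EOF'
apiVersion: monitoring.googleapis.com/v1
kind: OperatorConfig
metadata:
  name: config
  namespace: gmp-public
  annotations:
    meta.helm.sh/release-name: managedprometheus-patch
    meta.helm.sh/release-namespace: gmp-public
managedAlertmanager:
  configSecret:
    name: alertmanager
    key: alertmanager.yaml
EOF
}

# Hypothetical wrapper: server-side apply using the same field manager name
# that appears in the Terraform request.
apply_operatorconfig() {
  operatorconfig_manifest |
    kubectl apply --server-side --field-manager=Terraform -f -
}
```

If `apply_operatorconfig` succeeds where the annotation-only request failed, that would support adding the default value to the TF resource.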

pintohutch avatar Jan 12 '23 23:01 pintohutch