crossplane-provider-grafana

Existing AccessPolicyTokens get deleted and stuck on provider-grafana pod restart

Open DMarby opened this issue 1 year ago • 2 comments

When the provider-grafana pod is restarted, it deletes the existing tokens of any AccessPolicyToken resources that are present, and then gets stuck attempting to recreate them.

Running Crossplane version 1.16.0, and provider-grafana 1.8.0.

Provider logs:

2024/09/06 12:08:09 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:09 [DEBUG] DELETE https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:10 [DEBUG] POST https://grafana.com/api/v1/tokens?region=
2024/09/06 12:08:11 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:13 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:17 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:25 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:42 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:09:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:10:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:11:14 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
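
The request that stands out in the log above is the POST with an empty region parameter. As a rough diagnostic aid, the following sketch scans debug output of this shape and lists requests whose region query parameter is missing or empty. The function name requests_missing_region and the assumed line layout (date, time, [DEBUG], method, URL) are illustrative assumptions based on the lines shown here, not part of the provider.

```python
from urllib.parse import urlsplit, parse_qs


def requests_missing_region(log_text):
    """Return (method, url) pairs whose `region` query parameter is absent or empty."""
    flagged = []
    for line in log_text.splitlines():
        # Assumed layout: "<date> <time> [DEBUG] <METHOD> <URL>"
        parts = line.split()
        if len(parts) < 5 or parts[2] != "[DEBUG]":
            continue
        method, url = parts[3], parts[4]
        # keep_blank_values=True so that "region=" is kept as an empty string
        query = parse_qs(urlsplit(url).query, keep_blank_values=True)
        if not query.get("region", [""])[0]:
            flagged.append((method, url))
    return flagged


sample = """\
2024/09/06 12:08:09 [DEBUG] GET https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:09 [DEBUG] DELETE https://grafana.com/api/v1/tokens/<token-id>?region=us
2024/09/06 12:08:10 [DEBUG] POST https://grafana.com/api/v1/tokens?region=
"""

for method, url in requests_missing_region(sample):
    print(method, url)
```

Run against the first three log lines, this flags only the POST, which matches the "Field is required: region" error in the events.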

Events from the AccessPolicyToken resource:

Events:
  Type     Reason                        Age    From                                                                  Message
  ----     ------                        ----   ----                                                                  -------
  Warning  CannotUpdateExternalResource  6m54s  managed/cloud.grafana.crossplane.io/v1alpha1, kind=accesspolicytoken  failed to update the resource: [{0 409 Conflict 409 Conflict
{
  "code": "InvalidArgument",
  "message": "Field is required: region",
  "requestId": "9024b1f2-e154-4d74-a647-3a21182e8219"
} []}]
  Warning  CannotObserveExternalResource  49s (x11 over 6m53s)  managed/cloud.grafana.crossplane.io/v1alpha1, kind=accesspolicytoken  failed to observe the resource: [{0 error reading policy token with ID`us:d37c1ccd-8f60-4ae6-883f-33971463637a`: 404 Not Found  []}]
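
The observe error refers to the token by an ID of the form region:token-id (us:d37c1ccd-...), while the failing create request carries no region at all. The sketch below illustrates that mismatch under stated assumptions: it splits such an ID into its two parts and refuses to build a create URL when the region is empty. Both helper names (split_token_id, create_token_url) are hypothetical, and this is not the provider's or the Terraform provider's actual code, only an illustration of the kind of guard that would surface the problem before the API returns 409.

```python
def split_token_id(resource_id):
    """Split a '<region>:<token-id>' identifier (format assumed from the event above)."""
    region, sep, token_id = resource_id.partition(":")
    if not sep or not region or not token_id:
        raise ValueError(f"expected '<region>:<token-id>', got {resource_id!r}")
    return region, token_id


def create_token_url(region):
    """Build the create-token URL seen in the logs, rejecting an empty region."""
    # Hypothetical guard: fail locally instead of sending "?region=" to the API.
    if not region:
        raise ValueError("region is required")
    return f"https://grafana.com/api/v1/tokens?region={region}"


region, token_id = split_token_id("us:d37c1ccd-8f60-4ae6-883f-33971463637a")
print(region, token_id)
print(create_token_url(region))
```

With the region taken from the ID, the create URL carries region=us; passing an empty string raises ValueError rather than producing the ?region= request seen in the provider logs.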

Example existing AccessPolicyToken object (as part of a composition):

      name: grafana-api-key
      base:
        apiVersion: cloud.grafana.crossplane.io/v1alpha1
        kind: AccessPolicyToken
        spec:
          forProvider:
            region: us
            name: foo
            displayName: foo
            accessPolicyId: foo
            providerConfigRef:
              name: default
          writeConnectionSecretToRef:
            name: foo
            namespace: crossplane-system

This was seemingly introduced by https://github.com/grafana/crossplane-provider-grafana/pull/135

DMarby, Sep 06 '24 12:09

We're seeing this internally too; https://github.com/grafana/terraform-provider-grafana/pull/1886 is an attempt to fix it.

Duologic, Nov 05 '24 09:11

With the latest version I'm still seeing this (see the second log line, where the region is empty):

2025/01/22 19:17:42 [DEBUG] DELETE https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:42 [DEBUG] POST https://grafana.com/api/v1/tokens?region=
2025/01/22 19:17:44 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:49 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:17:57 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] POST https://grafana.com/api/v1/tokens?region=prod-eu-west-2
2025/01/22 19:18:14 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:15 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:47 [DEBUG] GET https://grafana.com/api/v1/tokens/51fGUID3b?region=prod-eu-west-2
2025/01/22 19:18:47 [DEBUG] POST https://grafana.com/api/v1/tokens?region=prod-eu-west-2
2025/01/22 19:18:48 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:18:48 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:18:49 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:20:43 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2
2025/01/22 19:20:44 [DEBUG] GET https://grafana.com/api/v1/tokens/bf879GUID5997?region=prod-eu-west-2

apiVersion: cloud.grafana.crossplane.io/v1alpha1
kind: AccessPolicyToken
metadata:
  labels:
    testing.upbound.io/example-name: test
  name: prometheus-access-policy-token
spec:
  providerConfigRef:
    name: grafana-cloud-provider
  forProvider:
    accessPolicySelector:
      matchLabels:
        test.it/grafana-access-policy: prometheus
    displayName: Prometheus Access Policy Token
    name: prometheus-access-policy-token
    region: prod-eu-west-2
  writeConnectionSecretToRef:
    name: prometheus-access-policy-token
    namespace: grafana-cloud
---
apiVersion: cloud.grafana.crossplane.io/v1alpha1
kind: AccessPolicy
metadata:
  labels:
    test.it/grafana-access-policy: prometheus
  name: prometheus-access-policy
spec:
  providerConfigRef:
    name: grafana-cloud-provider
  forProvider:
    displayName: Prometheus Access Policy
    name: prometheus-access-policy
    realm:
    - identifier: "0000000" # Changed for github post
      type: stack
    region: prod-eu-west-2
    scopes:
      - logs:write

The Loki pods give:

ts=2025-01-22T18:49:04.582609283Z level=error msg="final error sending batch" component_path=/ component_id=loki.write.hostedlogs component=client host=logs-prod-012.grafana.net status=401 tenant="" error="server returned HTTP status 401 Unauthorized (401): {"status":"error","error":"authentication error: legacy auth cannot be upgraded because the host is not found"}"

fe-ax, Jan 22 '25 19:01

I'm going to revert https://github.com/grafana/crossplane-provider-grafana/pull/135, as it introduces a new bug instead of fixing the original one.

Duologic, Sep 08 '25 08:09