[grafana] Multi-Attach Error for Volume on Upgrade

Open Chancebair opened this issue 3 years ago • 3 comments

It seems that the issue (https://github.com/grafana/helm-charts/issues/146) is still not solved, or I haven't configured my Helm values correctly.

Chart Version: 6.25.1
App Version: 8.4.2

Values

admin:
  existingSecret: ""
  passwordKey: admin-password
  userKey: admin-user
adminUser: admin
affinity: {}
appServer:
  autoscaling:
    enabled: false
  create: true
  replicaCount: 1
  resources: {}
autoscaling:
  enabled: false
containerSecurityContext: {}
cronJobs: {}
dashboardProviders: {}
dashboards: {}
dashboardsConfigMaps: {}
datasources: {}
deploymentStrategy:
  type: RollingUpdate
downloadDashboards:
  env: {}
  envFromSecret: ""
  resources: {}
downloadDashboardsImage:
  pullPolicy: IfNotPresent
  repository: curlimages/curl
  sha: ""
  tag: 7.73.0
enableKubeBackwardCompatibility: false
enableServiceLinks: true
env: {}
envFromConfigMaps: []
envFromSecret: ""
envFromSecrets: []
envRenderSecret: {}
envValueFrom: {}
extraConfigmapMounts: []
extraContainerVolumes: []
extraContainers: ""
extraEmptyDirMounts: []
extraExposePorts: []
extraInitContainers: []
extraLabels: {}
extraObjects: []
extraSecretMounts: []
extraVolumeMounts: []
fullnameOverride: ""
grafana.ini:
  analytics:
    check_for_updates: true
  auth.generic_oauth:
    allowed_domains: around.technology aroundhome.de
    api_url: https://auth.aroundhome.de/api/users/current
    auth_url: https://auth.aroundhome.de/oauth/authorize
    client_id: grafana-production
    client_secret: aa74bac78d33a861993cf8b397a06f31c114710c5b2a80b07d103c65e55ff8b7
    enabled: true
    name: Aroundhome Login
    scopes: public
    token_url: https://auth.aroundhome.de/oauth/token
  grafana_net:
    url: https://grafana.net
  log:
    mode: console
  paths:
    data: /var/lib/grafana/
    logs: /var/log/grafana
    plugins: /var/lib/grafana/plugins
    provisioning: /etc/grafana/provisioning
  server:
    root_url: https://grafana.prod-1.eks.aroundhome-production.de/
headlessService: false
hostAliases: []
image:
  env:
    APPLICATION_NAME: grafana
    AWS_REGION: eu-central-1
    ENVIRONMENT: production
    ON_AWS: true
    ON_EKS: true
  livenessProbe:
    path: /login
  port: 3000
  pullPolicy: Always
  readinessProbe:
    path: /login
  repository: grafana/grafana
  sha: ""
  tag: latest
imageRenderer:
  enabled: false
  env:
    HTTP_HOST: 0.0.0.0
  grafanaProtocol: http
  grafanaSubPath: ""
  hostAliases: []
  image:
    pullPolicy: Always
    repository: grafana/grafana-image-renderer
    sha: ""
    tag: latest
  networkPolicy:
    limitEgress: false
    limitIngress: true
  podPortName: http
  priorityClassName: ""
  replicas: 1
  resources: {}
  revisionHistoryLimit: 10
  securityContext: {}
  service:
    enabled: true
    port: 8081
    portName: http
    targetPort: 8081
  serviceAccountName: ""
ingress:
  annotations: {}
  enabled: false
  extraPaths: []
  hosts:
  - chart-example.local
  labels: {}
  path: /
  pathType: Prefix
  tls: []
initChownData:
  enabled: true
  image:
    pullPolicy: IfNotPresent
    repository: busybox
    sha: ""
    tag: 1.31.1
  resources: {}
jobs: {}
ldap:
  config: ""
  enabled: false
  existingSecret: ""
livenessProbe:
  failureThreshold: 10
  httpGet:
    path: /api/health
    port: 3000
  initialDelaySeconds: 60
  timeoutSeconds: 30
manualJob: false
nameOverride: ""
namespaceOverride: ""
networkPolicy:
  allowExternal: true
  enabled: false
  explicitNamespacesSelector: {}
nodeSelector: {}
notifiers: {}
persistence:
  accessModes:
  - ReadWriteOnce
  enabled: true
  finalizers:
  - kubernetes.io/pvc-protection
  inMemory:
    enabled: false
  size: 5Gi
  type: pvc
plugins: []
podDisruptionBudget:
  minAvailable: 1
podPortName: grafana
rbac:
  create: true
  extraClusterRoleRules: []
  extraRoleRules: []
  namespaced: false
  pspEnabled: true
  pspUseAppArmor: true
readinessProbe:
  httpGet:
    path: /api/health
    port: 3000
replicas: 2
resources: {}
resqueWorkers: {}
revisionHistoryLimit: 10
securityContext:
  fsGroup: 472
  runAsGroup: 472
  runAsUser: 472
service:
  annotations: {}
  domain: local
  enabled: true
  hostname: grafana
  labels: {}
  port: 80
  portName: service
  selector:
    app.kubernetes.io/app: grafana
  targetPort: 3000
  type: ClusterIP
serviceAccount:
  autoMount: true
  create: false
  name: null
  nameTest: null
serviceMonitor:
  enabled: false
  interval: 1m
  labels: {}
  path: /metrics
  relabelings: []
  scheme: http
  scrapeTimeout: 30s
  tlsConfig: {}
sidecar:
  dashboards:
    SCProvider: true
    defaultFolderName: null
    enabled: true
    extraMounts: []
    folder: /tmp/dashboards
    folderAnnotation: null
    label: grafana_dashboard
    labelValue: null
    provider:
      allowUiUpdates: false
      disableDelete: false
      folder: ""
      foldersFromFilesStructure: false
      name: sidecarProvider
      orgid: 1
      type: file
    resource: both
    script: null
    searchNamespace: null
    watchMethod: WATCH
  datasources:
    enabled: true
    initDatasources: false
    label: grafana_datasource
    labelValue: null
    reloadURL: http://localhost:3000/api/admin/provisioning/datasources/reload
    resource: both
    searchNamespace: null
    skipReload: false
    watchMethod: WATCH
  enableUniqueFilenames: false
  image:
    repository: quay.io/kiwigrid/k8s-sidecar
    sha: ""
    tag: 1.15.6
  imagePullPolicy: IfNotPresent
  notifiers:
    enabled: false
    label: grafana_notifier
    resource: both
    searchNamespace: null
  plugins:
    enabled: false
    initPlugins: false
    label: grafana_plugin
    labelValue: null
    reloadURL: http://localhost:3000/api/admin/provisioning/plugins/reload
    resource: both
    searchNamespace: null
    skipReload: false
    watchMethod: WATCH
  resources: {}
  securityContext: {}
smtp:
  existingSecret: ""
  passwordKey: password
  userKey: user
testFramework:
  enabled: true
  image: bats/bats
  imagePullPolicy: IfNotPresent
  securityContext: {}
  tag: v1.4.1
tolerations: []
workers: {}

Error

#  k describe pod grafana-744c7cfb56-sjvtw -n monitoring
Name:           grafana-744c7cfb56-sjvtw
Namespace:      monitoring
Priority:       0
Node:           ip-10-17-194-139.eu-central-1.compute.internal/10.17.194.139
Start Time:     Wed, 06 Apr 2022 11:36:35 +0200
Labels:         app.kubernetes.io/instance=grafana
                app.kubernetes.io/name=grafana
                pod-template-hash=744c7cfb56
                security.istio.io/tlsMode=istio
                service.istio.io/canonical-name=grafana
                service.istio.io/canonical-revision=latest
Annotations:    checksum/config: d0344cb3b5b65d25ac148de6f7ac5f19528f6493b716ee48022fe81994ceafab
                checksum/dashboards-json-config: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
                checksum/sc-dashboard-provider-config: b6682a4a9c410aa4b93a8bd52627e36bdd3fbc897dad2097a43639e305ff7ecc
                checksum/secret: d04d01bef1414d5686ab2d448d975e964cf53d9c660d5e61d3dadaef2721a67f
                kubernetes.io/psp: eks.privileged
                prometheus.io/path: /stats/prometheus
                prometheus.io/port: 15020
                prometheus.io/scrape: true
                sidecar.istio.io/status:
                  {"initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-data","istio-podinfo","istio-token","istiod-...
Status:         Pending
IP:
IPs:            <none>
Controlled By:  ReplicaSet/grafana-744c7cfb56
Init Containers:
  init-chown-data:
    Container ID:
    Image:         busybox:1.31.1
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
  istio-init:
    Container ID:
    Image:         docker.io/istio/proxyv2:1.11.4
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      istio-iptables
      -p
      15001
      -z
      15006
      -u
      1337
      -m
      REDIRECT
      -i
      *
      -x

      -b
      *
      -d
      15090,15021,15020
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  1Gi
    Requests:
      cpu:        100m
      memory:     128Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
Containers:
  grafana-sc-dashboard:
    Container ID:
    Image:          quay.io/kiwigrid/k8s-sidecar:1.15.6
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      METHOD:    WATCH
      LABEL:     grafana_dashboard
      FOLDER:    /tmp/dashboards
      RESOURCE:  both
    Mounts:
      /tmp/dashboards from sc-dashboard-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
  grafana-sc-datasources:
    Container ID:
    Image:          quay.io/kiwigrid/k8s-sidecar:1.15.6
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      METHOD:        WATCH
      LABEL:         grafana_datasource
      FOLDER:        /etc/grafana/provisioning/datasources
      RESOURCE:      both
      REQ_USERNAME:  <set to the key 'admin-user' in secret 'grafana'>      Optional: false
      REQ_PASSWORD:  <set to the key 'admin-password' in secret 'grafana'>  Optional: false
      REQ_URL:       http://localhost:3000/api/admin/provisioning/datasources/reload
      REQ_METHOD:    POST
    Mounts:
      /etc/grafana/provisioning/datasources from sc-datasources-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
  grafana:
    Container ID:
    Image:          grafana/grafana:latest
    Image ID:
    Ports:          80/TCP, 3000/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:15020/app-health/grafana/livez delay=60s timeout=30s period=10s #success=1 #failure=10
    Readiness:      http-get http://:15020/app-health/grafana/readyz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      GF_SECURITY_ADMIN_USER:      <set to the key 'admin-user' in secret 'grafana'>      Optional: false
      GF_SECURITY_ADMIN_PASSWORD:  <set to the key 'admin-password' in secret 'grafana'>  Optional: false
      GF_PATHS_DATA:               /var/lib/grafana/
      GF_PATHS_LOGS:               /var/log/grafana
      GF_PATHS_PLUGINS:            /var/lib/grafana/plugins
      GF_PATHS_PROVISIONING:       /etc/grafana/provisioning
    Mounts:
      /etc/grafana/grafana.ini from config (rw,path="grafana.ini")
      /etc/grafana/provisioning/dashboards/sc-dashboardproviders.yaml from sc-dashboard-provider (rw,path="provider.yaml")
      /etc/grafana/provisioning/datasources from sc-datasources-volume (rw)
      /tmp/dashboards from sc-dashboard-volume (rw)
      /var/lib/grafana from storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
  istio-proxy:
    Container ID:
    Image:         docker.io/istio/proxyv2:1.11.4
    Image ID:
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --log_output_level=default:info
      --concurrency
      2
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     2
      memory:  1Gi
    Requests:
      cpu:      100m
      memory:   128Mi
    Readiness:  http-get http://:15021/healthz/ready delay=1s timeout=3s period=2s #success=1 #failure=30
    Environment:
      JWT_POLICY:                    third-party-jwt
      PILOT_CERT_PROVIDER:           istiod
      CA_ADDR:                       istiod.istio-system.svc:15012
      POD_NAME:                      grafana-744c7cfb56-sjvtw (v1:metadata.name)
      POD_NAMESPACE:                 monitoring (v1:metadata.namespace)
      INSTANCE_IP:                    (v1:status.podIP)
      SERVICE_ACCOUNT:                (v1:spec.serviceAccountName)
      HOST_IP:                        (v1:status.hostIP)
      PROXY_CONFIG:                  {}

      ISTIO_META_POD_PORTS:          [
                                         {"name":"service","containerPort":80,"protocol":"TCP"}
                                         ,{"name":"grafana","containerPort":3000,"protocol":"TCP"}
                                     ]
      ISTIO_META_APP_CONTAINERS:     grafana-sc-dashboard,grafana-sc-datasources,grafana
      ISTIO_META_CLUSTER_ID:         Kubernetes
      ISTIO_META_INTERCEPTION_MODE:  REDIRECT
      ISTIO_META_WORKLOAD_NAME:      grafana
      ISTIO_META_OWNER:              kubernetes://apis/apps/v1/namespaces/monitoring/deployments/grafana
      ISTIO_META_MESH_ID:            cluster.local
      TRUST_DOMAIN:                  cluster.local
      ISTIO_KUBE_APP_PROBERS:        {"/app-health/grafana/livez":{"httpGet":{"path":"/api/health","port":3000,"scheme":"HTTP"},"timeoutSeconds":30},"/app-health/grafana/readyz":{"httpGet":{"path":"/api/health","port":3000,"scheme":"HTTP"},"timeoutSeconds":1}}
    Mounts:
      /etc/istio/pod from istio-podinfo (rw)
      /etc/istio/proxy from istio-envoy (rw)
      /var/lib/istio/data from istio-data (rw)
      /var/run/secrets/istio from istiod-ca-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kgls9 (ro)
      /var/run/secrets/tokens from istio-token (rw)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  istio-envoy:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  istio-data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  istio-podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
      metadata.annotations -> annotations
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
  istiod-ca-cert:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      istio-ca-root-cert
    Optional:  false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      grafana
    Optional:  false
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  grafana
    ReadOnly:   false
  sc-dashboard-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  sc-dashboard-provider:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      grafana-config-dashboards
    Optional:  false
  sc-datasources-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-kgls9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           5m5s  default-scheduler        Successfully assigned monitoring/grafana-744c7cfb56-sjvtw to ip-10-17-194-139.eu-central-1.compute.internal
  Warning  FailedAttachVolume  5m5s  attachdetach-controller  Multi-Attach error for volume "pvc-650e0b2d-ed60-4e6a-b2ec-1bece52408fe" Volume is already used by pod(s) grafana-647dd799fd-8rtqp, grafana-647dd799fd-bztsz
  Warning  FailedMount         3m2s  kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[sc-datasources-volume istio-envoy istio-token istio-data sc-dashboard-provider config kube-api-access-kgls9 sc-dashboard-volume istiod-ca-cert istio-podinfo storage]: timed out waiting for the condition
  Warning  FailedMount         48s   kubelet                  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[istio-envoy istiod-ca-cert sc-dashboard-volume sc-datasources-volume config storage istio-data kube-api-access-kgls9 sc-dashboard-provider istio-token istio-podinfo]: timed out waiting for the condition
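
To see which node is still holding the ReadWriteOnce volume that blocks the new pod, the VolumeAttachment objects can be inspected directly (a sketch; the PV name below is the one from the FailedAttachVolume event, and kubectl access to the cluster is assumed):

```shell
# List CSI volume attachments and filter for the affected PersistentVolume
kubectl get volumeattachment | grep pvc-650e0b2d-ed60-4e6a-b2ec-1bece52408fe

# Show which pods still mount the claim, and on which nodes they run
kubectl get pods -n monitoring -o wide
```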

Chancebair avatar Apr 06 '22 09:04 Chancebair

I have the same issue. Any updates on this?

alicyn avatar May 18 '22 18:05 alicyn

I have the same issue too

I think a database should be used here instead of the PVC; that would make it possible to run multiple Grafana instances.

m1zzx2 avatar Aug 23 '22 10:08 m1zzx2

I faced the same issue and found a manual workaround: I scaled the deployment down to 0, waited for final termination, and then scaled back up to 1.

usulkies avatar Nov 27 '22 16:11 usulkies
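
The scale-down workaround above can be scripted; the deployment name, namespace, and labels here are the ones from this thread and may differ in your setup:

```shell
# Scale the Grafana deployment to zero so the old pod releases the RWO volume
kubectl -n monitoring scale deployment grafana --replicas=0

# Block until every pod of the deployment has actually terminated
kubectl -n monitoring wait --for=delete pod \
  -l app.kubernetes.io/name=grafana --timeout=120s

# Scale back up; the fresh pod can now attach the volume cleanly
kubectl -n monitoring scale deployment grafana --replicas=1
```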

Same issue. The workaround is deleting the existing ReplicaSet (which obviously causes downtime).

zerodayyy avatar Dec 08 '22 14:12 zerodayyy

Hello, a potential workaround is to set

deploymentStrategy:
  type: Recreate

This does cause downtime, but at least doesn't require manual intervention.
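
If the release is managed directly with Helm, the same setting can be applied without editing a values file (a sketch; the release name `grafana` and namespace `monitoring` are assumptions from this thread):

```shell
# Switch the deployment strategy to Recreate while keeping all other values
helm upgrade grafana grafana/grafana -n monitoring \
  --reuse-values \
  --set deploymentStrategy.type=Recreate
```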

Spittal avatar May 18 '23 16:05 Spittal

Hello, a potential workaround is to set

deploymentStrategy:
  type: Recreate

This does cause downtime, but at least doesn't require manual intervention.

Thanks, helped a lot! 🎉

It's tricky because the setting isn't in the kube-prometheus-stack chart values themselves, but in the values of the dependent grafana subchart.
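
For kube-prometheus-stack users, settings for the bundled grafana subchart have to be nested under the top-level `grafana:` key in the umbrella chart's values (a sketch following the standard Helm subchart convention):

```yaml
# kube-prometheus-stack values.yaml
grafana:
  deploymentStrategy:
    type: Recreate
```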

mscholze avatar Sep 28 '23 09:09 mscholze