Document Postgres Deployment Error

Open osterman opened this issue 7 years ago • 0 comments

what

Containers:
  pr-1620-app-pg-postgresql:
    Container ID:
    Image:          r.cfcr.io/org/postgres:0.2.1
    Image ID:
    Port:           5432/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   256Mi
    Liveness:   exec [sh -c exec pg_isready --host $POD_IP] delay=120s timeout=5s period=10s #success=1 #failure=6
    Readiness:  exec [sh -c exec pg_isready --host $POD_IP] delay=5s timeout=3s period=5s #success=1 #failure=3
    Environment:
      POSTGRES_USER:         app
      PGUSER:                app
      POSTGRES_DB:           app_dev
      POSTGRES_INITDB_ARGS:
      PGDATA:                /var/lib/postgresql/data/pgdata
      POSTGRES_PASSWORD:     <set to the key 'postgres-password' in secret 'pr-1620-app-pg-postgresql'>  Optional: false
      POD_IP:                 (v1:status.podIP)
    Mounts:
      /var/lib/postgresql/data/pgdata from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pxtjf (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pr-1620-app-pg-postgresql
    ReadOnly:   false
  default-token-pxtjf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pxtjf
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age                  From                                                   Message
  ----     ------       ----                 ----                                                   -------
  Warning  FailedMount  38m (x449 over 17h)  kubelet, ip-172-20-110-115.us-west-2.compute.internal  Unable to mount volumes for pod "pr-1620-app-pg-postgresql-6644ff846f-m8hd5_pr-1620-app(6a9da756-693d-11e8-a064-0a82e2db714c)": timeout expired waiting for volumes to attach/mount for pod "pr-1620-app"/"pr-1620-app-pg-postgresql-6644ff846f-m8hd5". list of unattached/unmounted volumes=[data]
  Warning  FailedSync   4m (x464 over 17h)   kubelet, ip-172-20-110-115.us-west-2.compute.internal  Error syncing pod
  Warning  FailedMount  3m (x492 over 16h)   attachdetach                                           (combined from similar events): AttachVolume.Attach failed for volume "pvc-0107630e-6890-11e8-a064-0a82e2db714c" : Error attaching EBS volume "vol-04fadeef55093f4f5" to instance "i-07cf7f0fbf3fa1bc0": "VolumeInUse: vol-14fadeef55093f4f5 is already attached to an instance\n\tstatus code: 400, request id: 0f6fe03c-e7e4-4872-8c03-1c40ac8f0d82"

why

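The problem was caused by the postgres helm chart using a Deployment resource with strategy type: "RollingUpdate" by default. During a rolling update the replacement pod is created before the old pod is terminated, but the EBS volume behind the PersistentVolumeClaim can be attached to only one instance at a time. A replacement pod scheduled on a different node therefore fails with VolumeInUse for as long as the old pod holds the volume. The problem was exacerbated by using helm upgrade --recreate-pods.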

The quick fix is to drop the --recreate-pods argument. The better fix is to change the Deployment strategy to type: Recreate or move to a StatefulSet.
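
As a rough illustration of the Recreate option, the fragment below shows where the strategy would sit in a Deployment manifest. It is a minimal sketch, not the chart's actual template: the apiVersion and the single replica are assumptions, the name is taken from the pod description above, and the selector and pod template are left out.

```yaml
# Hypothetical excerpt of a Deployment manifest; only spec.strategy is the point here.
# apiVersion and replicas are assumptions, not values taken from the chart.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pr-1620-app-pg-postgresql
spec:
  replicas: 1
  strategy:
    # Recreate terminates the existing pod before the replacement is created,
    # so the EBS volume can be detached before the new pod needs it.
    type: Recreate
  # selector and template omitted; they stay as the chart defines them
```

The trade-off is a short outage on every rollout, since no postgres pod runs between the old pod stopping and the new one becoming ready.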

references

Jun 07 '18 05:06 osterman