unable to start new cluster with existing persistent volumes

Open taroface opened this issue 4 years ago • 3 comments

After deleting a cluster with kubectl delete -f example.yaml, without deleting its PVCs and PVs, I start a new cluster with kubectl apply -f example.yaml. The pod cockroachdb-0 stays unready and no other pods are created:

cockroachdb-0                         0/1     Running   0          2m17s

The pod shows:

Events:
  Type     Reason                  Age                From                                                 Message
  ----     ------                  ----               ----                                                 -------
  Normal   Scheduled               2m23s              default-scheduler                                    Successfully assigned default/cockroachdb-0 to gke-cockroachdb-default-pool-01b609ce-vkqv
  Normal   SuccessfulAttachVolume  2m17s              attachdetach-controller                              AttachVolume.Attach succeeded for volume "pvc-2c986ce6-96b8-42bf-80c9-10d4e39ace2c"
  Normal   Pulled                  2m14s              kubelet, gke-cockroachdb-default-pool-01b609ce-vkqv  Container image "cockroachdb/cockroach:v20.2.8" already present on machine
  Normal   Created                 2m14s              kubelet, gke-cockroachdb-default-pool-01b609ce-vkqv  Created container db
  Normal   Started                 2m14s              kubelet, gke-cockroachdb-default-pool-01b609ce-vkqv  Started container db
  Warning  Unhealthy               15s (x22 over 2m)  kubelet, gke-cockroachdb-default-pool-01b609ce-vkqv  Readiness probe failed: HTTP probe failed with statuscode: 500

I'm only able to create the new cluster if I first delete the volumes with kubectl delete pv,pvc --all.
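
For reference, the workaround above can be sketched as a small script. This is only a sketch of the steps described in this issue, not an official procedure: the function name recreate_cluster and the KUBECTL override variable are hypothetical names introduced for illustration, and example.yaml is the manifest mentioned above. Note that kubectl delete pv,pvc --all is destructive, and since PVs are cluster-scoped it removes every PV in the cluster, not only those of this CockroachDB cluster.

```shell
#!/usr/bin/env bash
# Sketch of the workaround from this issue (assumption: the default namespace
# is in use). KUBECTL and recreate_cluster are hypothetical names added for
# illustration; set KUBECTL=echo to print the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

recreate_cluster() {
  local manifest="${1:-example.yaml}"

  # Tear down the cluster; its PVCs and PVs are left behind.
  $KUBECTL delete -f "$manifest"

  # WARNING: destroys data. Deletes every PVC in the current namespace
  # and every PV in the cluster (PVs are not namespaced).
  $KUBECTL delete pv,pvc --all

  # Recreate the cluster; new volumes are provisioned from scratch.
  $KUBECTL apply -f "$manifest"
}
```

A dry run such as KUBECTL=echo followed by recreate_cluster example.yaml prints the three commands without touching the cluster, which is a reasonable way to check the sequence before running it for real.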

This is using the Operator version currently at https://github.com/cockroachdb/cockroach-operator/blob/master/manifests/operator.yaml on GKE, with CRDB v20.2.8.

taroface, May 14 '21 21:05

I thought this behavior was expected. I think this person on the CockroachDB Community Slack may be experiencing this issue as well (Slack conversation).

@keith-mcclellan , can you verify?

johnrk-zz, May 14 '21 21:05

This is a known issue and is fixed in https://github.com/cockroachdb/cockroach-operator/pull/477. It's set as a release blocker, so it should be fixed when that is merged.

keith-mcclellan, May 17 '21 19:05

The PR changed, but I am not certain my PR will fix this.

chrislovecnm, May 19 '21 14:05