GD2 pod automatically reboots.
Steps performed:
- Created the GCS cluster:
[vagrant@kube1 ~]$ kubectl get pods -n gcs
NAME                                   READY   STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-45snb   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-pgp2w   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-s8g76   2/2     Running   0          2d21h
csi-provisioner-glusterfsplugin-0      2/2     Running   0          2d21h
etcd-4m7wv5fqk2                        1/1     Running   0          2d21h
etcd-6mf2nsl2p4                        1/1     Running   0          2d21h
etcd-lbmh9xjxm8                        1/1     Running   0          2d21h
etcd-operator-7cb5bd459b-tddxt         1/1     Running   0          2d21h
gluster-kube1-0                        1/1     Running   1          2d21h
gluster-kube2-0                        1/1     Running   0          2d21h
gluster-kube3-0                        1/1     Running   0          2d21h
- Deleted the GD2 pod and watched it come back (see the note after this output):
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl delete pods -n gcs gluster-kube1-0
pod "gluster-kube1-0" deleted
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl get pods -n gcs
NAME                                   READY   STATUS              RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2     Running             0          2d21h
csi-nodeplugin-glusterfsplugin-45snb   2/2     Running             0          2d21h
csi-nodeplugin-glusterfsplugin-pgp2w   2/2     Running             0          2d21h
csi-nodeplugin-glusterfsplugin-s8g76   2/2     Running             0          2d21h
csi-provisioner-glusterfsplugin-0      2/2     Running             0          2d21h
etcd-4m7wv5fqk2                        1/1     Running             0          2d21h
etcd-6mf2nsl2p4                        1/1     Running             0          2d21h
etcd-lbmh9xjxm8                        1/1     Running             0          2d21h
etcd-operator-7cb5bd459b-tddxt         1/1     Running             0          2d21h
gluster-kube1-0                        0/1     ContainerCreating   0          5s
gluster-kube2-0                        1/1     Running             0          2d21h
gluster-kube3-0                        1/1     Running             0          2d21h
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl get pods -n gcs
NAME                                   READY   STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-45snb   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-pgp2w   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-s8g76   2/2     Running   0          2d21h
csi-provisioner-glusterfsplugin-0      2/2     Running   0          2d21h
etcd-4m7wv5fqk2                        1/1     Running   0          2d21h
etcd-6mf2nsl2p4                        1/1     Running   0          2d21h
etcd-lbmh9xjxm8                        1/1     Running   0          2d21h
etcd-operator-7cb5bd459b-tddxt         1/1     Running   0          2d21h
gluster-kube1-0                        1/1     Running   0          43s
gluster-kube2-0                        1/1     Running   0          2d21h
gluster-kube3-0                        1/1     Running   0          2d21h
[vagrant@kube1 ~]$
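Note (not part of the original run): instead of re-running kubectl get pods until the recreated pod shows Running, a single kubectl wait should block until it is Ready; the pod and namespace names are the ones used above.

    # Wait for the recreated GD2 pod to report Ready, or give up after two minutes
    kubectl -n gcs wait --for=condition=Ready pod/gluster-kube1-0 --timeout=120s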
- Logged in to the GD2 pod and executed glustercli commands using kube1's own endpoint (a diagnostic suggestion follows this output):
command terminated with exit code 1
[vagrant@kube1 ~]$ kubectl get pods -n gcs -w
NAME                                   READY   STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-45snb   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-pgp2w   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-s8g76   2/2     Running   0          2d21h
csi-provisioner-glusterfsplugin-0      2/2     Running   0          2d21h
etcd-4m7wv5fqk2                        1/1     Running   0          2d21h
etcd-6mf2nsl2p4                        1/1     Running   0          2d21h
etcd-lbmh9xjxm8                        1/1     Running   0          2d21h
etcd-operator-7cb5bd459b-tddxt         1/1     Running   0          2d21h
gluster-kube1-0                        1/1     Running   0          2m52s
gluster-kube2-0                        1/1     Running   0          2d21h
gluster-kube3-0                        1/1     Running   0          2d21h
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl -n gcs -it exec gluster-kube1-0 -- /bin/bash
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]# glustercli peer list --endpoints="http://gluster-kube1-0.glusterd2.gcs:24007"
Failed to get Peers list
Failed to connect to glusterd. Please check if
- Glusterd is running(http://gluster-kube1-0.glusterd2.gcs:24007) and reachable from this node.
- Make sure Endpoints specified in the command is valid
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]# glustercli volume list --endpoints="http://gluster-kube1-0.glusterd2.gcs:24007"
Error getting volumes list
Failed to connect to glusterd. Please check if
- Glusterd is running(http://gluster-kube1-0.glusterd2.gcs:24007) and reachable from this node.
- Make sure Endpoints specified in the command is valid
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]#
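A hedged diagnostic sketch (not run in this report): the liveness probe shown in the kubectl describe output further down hits http://:24007/ping, so the same endpoint can be checked by hand from inside the pod to separate "glusterd2 is not serving" from a glustercli/endpoint problem. This assumes curl is present in the glusterd2-nightly image, which may not be the case.

    # Inside gluster-kube1-0: does glusterd2 answer on its advertised endpoint?
    curl -v http://gluster-kube1-0.glusterd2.gcs:24007/ping
    # Same check bypassing cluster DNS, straight at the local process
    curl -v http://localhost:24007/ping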
- Then executed the same commands using kube2 and kube3 as endpoints, still logged in from the kube1 pod (a further cross-check is sketched after this output):
[root@gluster-kube1-0 /]# glustercli volume list --endpoints="http://gluster-kube2-0.glusterd2.gcs:24007"
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
| ID | NAME | TYPE | STATE | TRANSPORT | BRICKS |
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
| 6cd58524-5172-4d9e-89ae-414bc338eba6 | pvc-f603ac47dcdc11e8 | Replicate | Started | tcp | 3 |
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]# glustercli volume list --endpoints="http://gluster-kube3-0.glusterd2.gcs:24007"
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
| ID | NAME | TYPE | STATE | TRANSPORT | BRICKS |
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
| 6cd58524-5172-4d9e-89ae-414bc338eba6 | pvc-f603ac47dcdc11e8 | Replicate | Started | tcp | 3 |
+--------------------------------------+----------------------+-----------+---------+-----------+--------+
[root@gluster-kube1-0 /]#
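A further cross-check that was not performed here, sketched using only the commands already seen in this report: running the same glustercli call from the gluster-kube2-0 pod against the kube1 endpoint would distinguish "glusterd2 on kube1 is not serving" from "the kube1 pod cannot reach or resolve its own endpoint".

    # From the host: exec into kube2's GD2 pod and query kube1's endpoint
    kubectl -n gcs exec gluster-kube2-0 -- glustercli peer list --endpoints="http://gluster-kube1-0.glusterd2.gcs:24007"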
- While executing further commands, the pod automatically restarted:
[root@gluster-kube1-0 /]#
[root@gluster-kube1-0 /]# glustercli volume list -command terminated with exit code 137usterd2.gcs:24007"
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl get pods -n gcs -w
NAME                                   READY   STATUS    RESTARTS   AGE
csi-attacher-glusterfsplugin-0         2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-45snb   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-pgp2w   2/2     Running   0          2d21h
csi-nodeplugin-glusterfsplugin-s8g76   2/2     Running   0          2d21h
csi-provisioner-glusterfsplugin-0      2/2     Running   0          2d21h
etcd-4m7wv5fqk2                        1/1     Running   0          2d21h
etcd-6mf2nsl2p4                        1/1     Running   0          2d21h
etcd-lbmh9xjxm8                        1/1     Running   0          2d21h
etcd-operator-7cb5bd459b-tddxt         1/1     Running   0          2d21h
gluster-kube1-0                        1/1     Running   1          4m
gluster-kube2-0                        1/1     Running   0          2d21h
gluster-kube3-0                        1/1     Running   0          2d21h
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$
[vagrant@kube1 ~]$ kubectl describe pods -n gcs gluster-kube1-0
Name: gluster-kube1-0
Namespace: gcs
Priority: 0
PriorityClassName: <none>
Node: kube1/192.168.121.7
Start Time: Fri, 02 Nov 2018 05:39:18 +0000
Labels: app.kubernetes.io/component=glusterfs
app.kubernetes.io/name=glusterd2
app.kubernetes.io/part-of=gcs
controller-revision-hash=gluster-kube1-55bc79f94
statefulset.kubernetes.io/pod-name=gluster-kube1-0
Annotations: <none>
Status: Running
IP: 10.233.64.7
Controlled By: StatefulSet/gluster-kube1
Containers:
glusterd2:
Container ID: docker://a261c3bcb84f993948b0691e199396109985d1bd9d547250476168cfd01a9520
Image: docker.io/gluster/glusterd2-nightly
Image ID: docker-pullable://docker.io/gluster/glusterd2-nightly@sha256:06e42f3354bff80a724007dbc5442349c3a53d31eceb935fd6b3776d6cdcb0fa
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 02 Nov 2018 05:43:08 +0000
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Fri, 02 Nov 2018 05:39:48 +0000
Finished: Fri, 02 Nov 2018 05:43:04 +0000
Ready: True
Restart Count: 1
Liveness: http-get http://:24007/ping delay=10s timeout=1s period=60s #success=1 #failure=3
Environment:
GD2_ETCDENDPOINTS: http://etcd-client.gcs:2379
GD2_CLUSTER_ID: 27056e19-500a-4e7a-b5a9-71f461679196
GD2_CLIENTADDRESS: gluster-kube1-0.glusterd2.gcs:24007
GD2_ENDPOINTS: http://gluster-kube1-0.glusterd2.gcs:24007
GD2_PEERADDRESS: gluster-kube1-0.glusterd2.gcs:24008
GD2_RESTAUTH: false
Mounts:
/dev from gluster-dev (rw)
/run/lvm from gluster-lvm (rw)
/sys/fs/cgroup from gluster-cgroup (ro)
/usr/lib/modules from gluster-kmods (ro)
/var/lib/glusterd2 from glusterd2-statedir (rw)
/var/log/glusterd2 from glusterd2-logdir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-8s2lg (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
gluster-dev:
Type: HostPath (bare host directory volume)
Path: /dev
HostPathType:
gluster-cgroup:
Type: HostPath (bare host directory volume)
Path: /sys/fs/cgroup
HostPathType:
gluster-lvm:
Type: HostPath (bare host directory volume)
Path: /run/lvm
HostPathType:
gluster-kmods:
Type: HostPath (bare host directory volume)
Path: /usr/lib/modules
HostPathType:
glusterd2-statedir:
Type: HostPath (bare host directory volume)
Path: /var/lib/glusterd2
HostPathType: DirectoryOrCreate
glusterd2-logdir:
Type: HostPath (bare host directory volume)
Path: /var/log/glusterd2
HostPathType: DirectoryOrCreate
default-token-8s2lg:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-8s2lg
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type     Reason     Age                From               Message
----     ------     ----               ----               -------
Normal   Scheduled  30m                default-scheduler  Successfully assigned gcs/gluster-kube1-0 to kube1
Warning  Unhealthy  27m (x3 over 29m)  kubelet, kube1     Liveness probe failed: Get http://10.233.64.7:24007/ping: dial tcp 10.233.64.7:24007: connect: connection refused
Normal   Pulling    26m (x2 over 30m)  kubelet, kube1     pulling image "docker.io/gluster/glusterd2-nightly"
Normal   Killing    26m                kubelet, kube1     Killing container with id docker://glusterd2:Container failed liveness probe.. Container will be killed and recreated.
Normal   Pulled     26m (x2 over 30m)  kubelet, kube1     Successfully pulled image "docker.io/gluster/glusterd2-nightly"
Normal   Created    26m (x2 over 30m)  kubelet, kube1     Created container
Normal   Started    26m (x2 over 30m)  kubelet, kube1     Started container
[vagrant@kube1 ~]$
After this restart, were you able to use gd2 on gluster-kube1-0?
The failed health check caused the restart of the pod, and that was for the same reason you couldn't use glustercli. We're still left with the question of why...
Do you have any logs from the gd2 container before it was killed for being unhealthy?
@JohnStrunk, I tried to capture the logs, but the container was killed before I could get to them. I can retry the scenario to see whether I hit the same thing again, and will try to capture the logs in a different terminal.
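A hedged note on capturing those logs: kubectl keeps the log of the previous container instance after a restart, so a second terminal should not be needed. Pod and namespace names are the ones from this issue.

    # Logs of the instance that was killed by the liveness probe (before the restart)
    kubectl -n gcs logs gluster-kube1-0 --previous
    # Logs of the currently running instance, for comparison
    kubectl -n gcs logs gluster-kube1-0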
@ksandha, can you provide the info below:
- Logs from the restarted container (/var/log/glusterd2/glusterd2)
- Output of kubectl get events
Normal Killing 26m kubelet, kube1 Killing container with id docker://glusterd2:Container failed liveness probe.. Container will be killed and recreated.
My suspicion is that this is due to issue #68.
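For reference, a hedged sketch of how the two items requested above could be collected; it assumes only the paths and names already mentioned in this issue and does not guess at file names under /var/log/glusterd2.

    # Copy the whole glusterd2 log directory off the restarted pod (kubectl cp needs tar inside the container)
    kubectl cp gcs/gluster-kube1-0:/var/log/glusterd2 ./gluster-kube1-0-glusterd2-logs
    # Recent events in the gcs namespace, oldest first
    kubectl -n gcs get events --sort-by=.metadata.creationTimestamp

Since /var/log/glusterd2 is a HostPath mount (see the Volumes section of the describe output above), the same files should also be readable directly on the kube1 node.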