glusterd2
Brick process does not come up on the node after a gluster node reboot
Observed behavior
With a single PVC (brick-mux not enabled), after gluster-node-1 is rebooted, the brick process on gluster-node-1 is not running.
[root@gluster-kube1-0 /]# glustercli volume status
Volume : pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
| BRICK ID | HOST | PATH | ONLINE | PORT | PID |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
| 9b0246ac-274e-4ed4-822d-bdaf53f91f93 | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick1/brick | true | 32844 | 5199 |
| 4a891bc0-d7f9-4ba7-a1b0-4d18fee9d6d5 | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick2/brick | true | 45057 | 1848 |
| 9b6f1829-76ad-43f0-9f86-2db80ae5b367 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick | false | 0 | 0 |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
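To spot the offline brick without reading the table by eye, the output can be filtered with a small script. This is only a sketch: it assumes the ASCII table layout shown above (pipe-separated cells with an ONLINE column), and the helper names `parse_status_table` and `offline_bricks` are hypothetical, not part of glusterd2 or glustercli.

```python
def parse_status_table(text):
    """Parse the ASCII table printed by `glustercli volume status` into dicts.

    Assumes the layout shown in this issue: a header row and data rows that
    start with '|', separated by '+---+' border lines.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip borders, the "Volume : ..." line, and shell prompts.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' row is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def offline_bricks(text):
    """Return the rows whose ONLINE column is anything other than 'true'."""
    return [row for row in parse_status_table(text) if row.get("ONLINE") != "true"]


# Sample copied from the output above (border lines shortened).
SAMPLE = """\
Volume : pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d
+----------+------+------+--------+------+-----+
| BRICK ID | HOST | PATH | ONLINE | PORT | PID |
+----------+------+------+--------+------+-----+
| 9b0246ac-274e-4ed4-822d-bdaf53f91f93 | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick1/brick | true | 32844 | 5199 |
| 4a891bc0-d7f9-4ba7-a1b0-4d18fee9d6d5 | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick2/brick | true | 45057 | 1848 |
| 9b6f1829-76ad-43f0-9f86-2db80ae5b367 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick | false | 0 | 0 |
+----------+------+------+--------+------+-----+
"""

if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE):
        print(brick["HOST"], brick["PATH"])
```

In practice the text would come from running glustercli volume status on one of the nodes and piping its output into the script.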
The following messages appear continuously in the glusterd2 logs:
time="2019-01-23 07:51:42.185010" level=info msg="client disconnected" address="10.233.65.5:1017" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
time="2019-01-23 07:51:42.356707" level=info msg="client connected" address="10.233.66.7:1004" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp
time="2019-01-23 07:51:42.358309" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"
time="2019-01-23 07:51:42.359247" level=info msg="client disconnected" address="10.233.66.7:1004" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
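Because these lines repeat continuously, it can help to count the failed port lookups per brick rather than scan the log manually. The sketch below assumes the key=value log format seen in the lines above (values optionally double-quoted); the names `parse_line` and `failed_brick_lookups` are hypothetical helpers, not glusterd2 code.

```python
import re
from collections import Counter

# key=value pairs; a value is either a double-quoted string or a run of non-spaces.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if value.startswith('"'):
            value = value[1:-1]  # drop surrounding quotes (escapes are left as-is)
        fields[key] = value
    return fields


def failed_brick_lookups(lines):
    """Count error-level SearchByBrickPath failures, keyed by brick path."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields.get("level") == "error" and "SearchByBrickPath" in fields.get("msg", ""):
            counts[fields.get("brick", "<unknown>")] += 1
    return counts


# Sample copied from the log excerpt above.
SAMPLE_LOG = '''\
time="2019-01-23 07:51:42.185010" level=info msg="client disconnected" address="10.233.65.5:1017" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
time="2019-01-23 07:51:42.356707" level=info msg="client connected" address="10.233.66.7:1004" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp
time="2019-01-23 07:51:42.358309" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"
time="2019-01-23 07:51:42.359247" level=info msg="client disconnected" address="10.233.66.7:1004" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
'''

if __name__ == "__main__":
    for brick, count in failed_brick_lookups(SAMPLE_LOG.splitlines()).items():
        print(count, brick)
```

A steadily growing count for a single brick path would be consistent with clients repeatedly asking the portmapper for a brick that never registered its port.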
[root@gluster-kube1-0 /]# glustercli peer status
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
| ID | NAME | CLIENT ADDRESSES | PEER ADDRESSES | ONLINE | PID |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
| 07615b77-0be0-4cf7-bfb2-448968404891 | gluster-kube3-0 | gluster-kube3-0.glusterd2.gcs:24007 | gluster-kube3-0.glusterd2.gcs:24008 | yes | 5092 |
| 5116ce37-c13d-48af-bb32-33f64aa1858d | gluster-kube2-0 | gluster-kube2-0.glusterd2.gcs:24007 | gluster-kube2-0.glusterd2.gcs:24008 | yes | 1689 |
| e7733b6e-8cb1-41d8-be45-bd639016be06 | gluster-kube1-0 | gluster-kube1-0.glusterd2.gcs:24007 | gluster-kube1-0.glusterd2.gcs:24008 | yes | 32 |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
Expected/desired behavior
The brick process should be running on the node after the gluster node reboots.
Details on how to reproduce (minimal and precise)
- Create a 3-node GCS setup using vagrant.
- Create a PVC (brick-mux is not enabled).
- Reboot gluster-node-1 and check glustercli volume status on the other gluster nodes.
- I had run "systemctl enable glusterd2.service" on the gluster node, but for some reason the glusterd2 process didn't come up automatically after the first reboot, so I rebooted the node again.
- This time the glusterd2 service started automatically. Check glustercli volume status again.
Information about the environment:
- Glusterd2 version used (e.g. v4.1.0 or master): v6.0-dev.115.gitf469248
- Operating system used:
- Glusterd2 compiled from sources, as a package (rpm/deb), or container:
- Using External ETCD: (yes/no, if yes ETCD version): yes
- If container, which container image:
- Using kubernetes, openshift, or direct install:
- If kubernetes/openshift, is gluster running inside kubernetes/openshift or outside: kubernetes
Please use triple backticks (```) when pasting CLI output.
Have we made progress on this? This is currently marked as a GCS 1.0 blocker.