
Brick process didn't come up on the node post gluster node reboot

PrasadDesala opened this issue on Jan 23 '19 · 2 comments

Observed behavior

With a single PVC (brick-mux not enabled), reboot gluster-node-1; after the reboot, the brick process on gluster-node-1 is not running:

```
[root@gluster-kube1-0 /]# glustercli volume status
Volume : pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
|               BRICK ID               |             HOST              |                                          PATH                                           | ONLINE | PORT  | PID  |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
| 9b0246ac-274e-4ed4-822d-bdaf53f91f93 | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick1/brick | true   | 32844 | 5199 |
| 4a891bc0-d7f9-4ba7-a1b0-4d18fee9d6d5 | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick2/brick | true   | 45057 | 1848 |
| 9b6f1829-76ad-43f0-9f86-2db80ae5b367 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick | false  |     0 |    0 |
+--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
```
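As a quick way to confirm on the rebooted node that the brick process is genuinely absent (and not merely misreported), one can check for it directly. This is a diagnostic sketch, assuming the per-brick daemon is glusterfsd as in standard GlusterFS deployments; it is not part of the original report:

```
# On gluster-kube1-0: list any running brick (glusterfsd) processes
pgrep -af glusterfsd

# A healthy brick also appears as a TCP listener (cf. port 32844 on gluster-kube3-0)
ss -tlnp | grep glusterfsd
```

On a healthy peer both commands return output; on gluster-kube1-0 they would be expected to return nothing here.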

The following messages are logged continuously by glusterd2; the SearchByBrickPath error means no port is registered for the brick, which is consistent with the brick process never having come up:

time="2019-01-23 07:51:42.185010" level=info msg="client disconnected" address="10.233.65.5:1017" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"
time="2019-01-23 07:51:42.356707" level=info msg="client connected" address="10.233.66.7:1004" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp
time="2019-01-23 07:51:42.358309" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-e78ffdcf-1ee1-11e9-b021-525400e2de2d/subvol1/brick3/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"
time="2019-01-23 07:51:42.359247" level=info msg="client disconnected" address="10.233.66.7:1004" server=sunrpc source="[server.go:109:sunrpc.(*SunRPC).pruneConn]"

Meanwhile, glustercli peer status shows glusterd2 itself online on all three nodes:

```
[root@gluster-kube1-0 /]# glustercli peer status
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
|                  ID                  |      NAME       |          CLIENT ADDRESSES           |           PEER ADDRESSES            | ONLINE | PID  |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
| 07615b77-0be0-4cf7-bfb2-448968404891 | gluster-kube3-0 | gluster-kube3-0.glusterd2.gcs:24007 | gluster-kube3-0.glusterd2.gcs:24008 | yes    | 5092 |
| 5116ce37-c13d-48af-bb32-33f64aa1858d | gluster-kube2-0 | gluster-kube2-0.glusterd2.gcs:24007 | gluster-kube2-0.glusterd2.gcs:24008 | yes    | 1689 |
| e7733b6e-8cb1-41d8-be45-bd639016be06 | gluster-kube1-0 | gluster-kube1-0.glusterd2.gcs:24007 | gluster-kube1-0.glusterd2.gcs:24008 | yes    |   32 |
+--------------------------------------+-----------------+-------------------------------------+-------------------------------------+--------+------+
```

Expected/desired behavior

The brick process should come back up on the node after a gluster node reboot.

Details on how to reproduce (minimal and precise)

  1. Create a 3-node GCS setup using vagrant.
  2. Create a PVC (brick-mux is not enabled).
  3. Reboot gluster-node-1 and check glustercli volume status from the other gluster nodes.
  4. glusterd2.service is enabled on the gluster node ("systemctl enable glusterd2.service"), but for some reason the glusterd2 process did not come up automatically, so reboot the node again.
  5. This time the glusterd2 service started automatically; check glustercli volume status again (a condensed version of this check is sketched below).
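For reference, the post-reboot check from steps 3–5 condensed into commands; the unit name glusterd2.service comes from step 4, and the CLI invocations are the ones shown above:

```
# On the rebooted node: verify the management daemon is enabled and running
systemctl is-enabled glusterd2.service
systemctl status glusterd2.service

# From any node: verify peers and bricks report online again
glustercli peer status
glustercli volume status
```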

Information about the environment:

  • Glusterd2 version used (e.g. v4.1.0 or master): v6.0-dev.115.gitf469248
  • Operating system used:
  • Glusterd2 compiled from sources, as a package (rpm/deb), or container:
  • Using External ETCD: (yes/no, if yes ETCD version): yes
  • If container, which container image:
  • Using kubernetes, openshift, or direct install:
  • If kubernetes/openshift, is gluster running inside kubernetes/openshift or outside: kubernetes

PrasadDesala · Jan 23 '19 07:01

Please use triple backticks (```) when pasting CLI output.

oshankkumar · Jan 23 '19 11:01

Have we made progress on this? It is currently marked as a GCS 1.0 blocker.

atinmu · Jan 28 '19 12:01