
DRBD resources change to `Consistent` state after the disk node is powered off and on.

Open Dipanshu-Sehjal opened this issue 2 years ago • 2 comments

From time to time, my HA testing of DRBD has shown that some DRBD resources go into the "Consistent" state when the node carrying them is powered off and then powered back on after a prolonged period.

Setup 1 -

This happens 7/10 times on the following setup -

3 nodes with 1 disk node and 2 diskless nodes. Replication and auto-eviction are turned off.

Test case procedures -

  1. Shut down the ONLY disk node and wait 30 minutes before powering it on.
  2. linstor resource list shows some resources in the Consistent state while their diskless counterparts show Usage = InUse.

There are only 2 replicas associated with each resource - one Diskless and one diskful.
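A quick way to spot affected resources, using the same controller deployment that appears in the listing further below:

kubectl exec -n piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- linstor resource list | grep Consistent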

As a workaround:

kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup disconnect <Consistent-state-resource-name> <node-id of diskless peer>
kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup connect <Consistent-state-resource-name> <node-id of diskless peer>

This brings the resource back into the UpToDate state. BUT sometimes this workaround puts the resource into Outdated, and that becomes a totally different problem which I don't know how to recover from when this is the only physical replica in the cluster and the Diskless resource is connected to it.
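A minimal sketch of that workaround as a script, using the same placeholders as above (the linstor-satellite pod on the diskful node, the affected resource name, and the diskless peer's node-id as shown by drbdadm dump):

#!/bin/sh
# Reconnect a resource stuck in Consistent and re-check its disk state.
NS=piraeus
POD=<ns_pod-name>       # linstor-satellite pod on the diskful node
RES=<resource-name>     # resource stuck in Consistent
PEER=<node-id>          # node-id of the diskless peer

kubectl exec -n "$NS" "$POD" -c linstor-satellite -- drbdsetup disconnect "$RES" "$PEER"
kubectl exec -n "$NS" "$POD" -c linstor-satellite -- drbdsetup connect "$RES" "$PEER"

# Expect UpToDate/Diskless here; Outdated/Diskless means the workaround failed.
kubectl exec -n "$NS" "$POD" -c linstor-satellite -- drbdadm dstate "$RES"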

Setup 2 -

This issue happens about 2/10 times on a 3-node cluster with 2 disk nodes and 1 diskless node. Replication is turned on and auto-eviction is turned off.

Test case procedures -

  1. Shut down either disk node and wait 30 minutes before powering it on.
  2. linstor resource list shows some resources in the Consistent state. This one is easier to get past because replication is turned on, so the other replica becomes Primary and continues to serve data. I can then use drbdsetup disconnect and connect, and can even delete the resource when it goes into an Outdated state (see the sketch below this list).
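A sketch of that recovery path, assuming the replica on the affected node ends up Outdated (pod, node, resource, and pool names are placeholders; linstor resource delete and linstor resource create are the standard client commands):

# Drop and re-establish the connection on the affected node's satellite.
kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup disconnect <resource> <peer-node-id>
kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup connect <resource> <peer-node-id>

# If the replica stays Outdated, delete it and re-create it so it
# resyncs from the surviving UpToDate peer.
kubectl exec -n piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- linstor resource delete <node-name> <resource>
kubectl exec -n piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- linstor resource create <node-name> <resource> --storage-pool <pool-name>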

However, it is not as straightforward if the application replicates data by itself and does not use DRBD for replication. In other words, when DRBD replication is turned off for a resource, we are back in a situation similar to Setup 1.
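For context, such a single-replica resource is what you would typically get from a StorageClass along these lines (a minimal sketch; autoPlace is the linstor-csi parameter for the number of diskful replicas in this Piraeus version, and the storage pool name is a placeholder):

kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-single-replica
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "1"            # one diskful replica; no DRBD replication between nodes
  storagePool: <pool-name>  # placeholder pool name
EOF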

For instance, on a 3-node cluster with 1 disk node and 2 diskless nodes -

Please see below for the complete linstor resource list output. pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac is marked as Consistent.

k exec --namespace=piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- linstor r l

+--------------------------------------------------------------------------------------------------------------------------------+
| ResourceName                             | Node                     | Port | Usage  | Conns | State      | CreatedOn           |
|================================================================================================================================|
| pvc-5bac2d82-9e42-4e0d-b828-3c598f2d8795 | flex188-126.dr.avaya.com | 7009 | Unused | Ok    | UpToDate   | 2022-03-22 04:58:52 |
| pvc-5bac2d82-9e42-4e0d-b828-3c598f2d8795 | flex188-128.dr.avaya.com | 7009 | InUse  | Ok    | Diskless   | 2022-03-22 04:58:53 |
| pvc-16c0b34b-bed3-4219-8b9f-415e8d1734fb | flex188-126.dr.avaya.com | 7005 | InUse  | Ok    | UpToDate   | 2022-03-22 04:56:35 |
| pvc-19dc5cea-733a-41b3-bd83-a2c4ea5012da | flex188-126.dr.avaya.com | 7004 | InUse  | Ok    | UpToDate   | 2022-03-22 04:56:31 |
| pvc-32e7a7bf-f0c2-4bca-941b-102780fcf7bd | flex188-126.dr.avaya.com | 7003 | Unused | Ok    | UpToDate   | 2022-03-22 05:49:52 |
| pvc-32e7a7bf-f0c2-4bca-941b-102780fcf7bd | flex188-128.dr.avaya.com | 7003 | InUse  | Ok    | Diskless   | 2022-03-22 05:49:54 |
| pvc-367dca54-39c8-415e-9633-0295730bbd44 | flex188-126.dr.avaya.com | 7002 | InUse  | Ok    | UpToDate   | 2022-03-21 04:47:47 |
| pvc-69090137-263d-4cca-b402-02fd5f377041 | flex188-126.dr.avaya.com | 7008 | Unused | Ok    | UpToDate   | 2022-03-22 04:56:48 |
| pvc-69090137-263d-4cca-b402-02fd5f377041 | flex188-128.dr.avaya.com | 7008 | InUse  | Ok    | Diskless   | 2022-03-22 04:56:52 |
| pvc-a36f02aa-da1a-4eaf-b797-b035dfcd5a22 | flex188-126.dr.avaya.com | 7006 | Unused | Ok    | UpToDate   | 2022-03-22 04:56:36 |
| pvc-a36f02aa-da1a-4eaf-b797-b035dfcd5a22 | flex188-127.dr.avaya.com | 7006 | InUse  | Ok    | Diskless   | 2022-03-22 04:56:38 |
| pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac | flex188-126.dr.avaya.com | 7012 | Unused | Ok    | Consistent | 2022-03-22 06:22:05 |
| pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac | flex188-128.dr.avaya.com | 7012 | InUse  | Ok    | Diskless   | 2022-03-22 16:20:24 |
| pvc-c806c751-1efd-405b-bf63-22c7ffc53ede | flex188-126.dr.avaya.com | 7007 | InUse  | Ok    | UpToDate   | 2022-03-22 04:56:48 |
| pvc-cf52d305-926d-40b3-95b8-72c07b623d19 | flex188-126.dr.avaya.com | 7001 | InUse  | Ok    | UpToDate   | 2022-03-21 04:47:46 |
| pvc-d39187c2-d560-42db-8dc3-c6e57505ae72 | flex188-126.dr.avaya.com | 7000 | Unused | Ok    | UpToDate   | 2022-03-21 04:07:50 |
| pvc-d39187c2-d560-42db-8dc3-c6e57505ae72 | flex188-127.dr.avaya.com | 7000 | InUse  | Ok    | Diskless   | 2022-03-21 04:07:53 |
+--------------------------------------------------------------------------------------------------------------------------------+

k exec -n piraeus piraeus-op-piraeus-operator-ns-node-cv9qs -c linstor-satellite -- drbdadm dstate pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac
Consistent/Diskless

k exec -n piraeus piraeus-op-piraeus-operator-ns-node-cv9qs -c linstor-satellite -- drbdadm cstate pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac
Connected

k exec -n piraeus piraeus-op-piraeus-operator-ns-node-cv9qs -c linstor-satellite -- drbdadm dump pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac

# resource pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac on flex188-126.dr.avaya.com: not ignored, not stacked
# defined at /var/lib/linstor.d/pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac.res:6
resource pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac {
    on flex188-126.dr.avaya.com {
        node-id 0;
        volume 0 {
            disk {
                discard-zeroes-if-aligned yes;
                rs-discard-granularity 262144;
            }
            device       minor 1012;
            disk         /dev/vg_sds/pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac_00000;
            meta-disk    internal;
        }
    }
    on flex188-128.dr.avaya.com {
        node-id 1;
        volume 0 {
            disk {
                discard-zeroes-if-aligned yes;
                rs-discard-granularity 262144;
            }
            device       minor 1012;
            disk         none;
            meta-disk    internal;
        }
    }
    connection {
        host flex188-126.dr.avaya.com         address         ipv4 10.129.188.126:7012;
        host flex188-128.dr.avaya.com         address         ipv4 10.129.188.128:7012;
        net {
            _name        flex188-128.dr.avaya.com;
        }
    }
    options {
        quorum           off;
    }
    net {
        cram-hmac-alg    sha1;
        shared-secret    "GY+QykHkQDGxatroysuB";
        max-buffers      10000;
        max-epoch-size   10000;
        protocol           C;
        sndbuf-size        0;
        verify-alg       crct10dif-pclmul;
    }
}
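For completeness, the running in-kernel configuration can be cross-checked against the generated .res file above (drbdsetup show prints the live settings):

k exec -n piraeus piraeus-op-piraeus-operator-ns-node-cv9qs -c linstor-satellite -- drbdsetup show pvc-ae4bb911-a227-4dde-a81a-2911d1c14aac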


Software version -

k exec --namespace=piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- linstor --version
linstor 1.13.0; GIT-hash: 840cf57c75c166659509e22447b2c0ca6377ee6d

k exec --namespace=piraeus deployment/piraeus-op-piraeus-operator-cs-controller -- drbdadm -V
DRBDADM_BUILDTAG=GIT-hash:\ 087ee6b4961ca154d76e4211223b03149373bed8\ build\ by\ @buildsystem,\ 2022-01-28\ 12:19:33
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x090106
DRBD_KERNEL_VERSION=9.1.6
DRBDADM_VERSION_CODE=0x091402
DRBDADM_VERSION=9.20.2

piraeus-operator-1.8.0

uname -a
4.18.0-348.20.1.el8_5.x86_64 #1 SMP Tue Mar 8 12:56:54 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

Dipanshu-Sehjal · Mar 22 '22 19:03

What is the output of drbdsetup status on the Satellite pod for the node with the Consistent resource in this situation? It may be that linstor has missed the state change, even though it is fine at the DRBD level.
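For reference, that could be collected with something like the following (pod name is a placeholder; --verbose and --statistics are standard drbdsetup options, and drbdsetup events2 --now prints the current state as one-shot events):

kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup status --verbose --statistics
kubectl exec -n piraeus <ns_pod-name> -c linstor-satellite -- drbdsetup events2 --now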

JoelColledge · Apr 07 '22 09:04

Sorry, I don't have the setup anymore. I will add logs if I see this again. Is there anything else I should collect next time?

Dipanshu-Sehjal · May 26 '22 07:05