linstor-csi
Promotion of secondaries delayed/not happening
We are testing a system with 2 worker nodes and 1 further node acting as a tie-breaker.
This issue is twofold:
- If many PVCs are created at once (in this example we create 6 at once), many of them end up with all nodes in the Secondary role. It can then take up to an hour for any node to be promoted to Primary, and during this time the pods depending on the PVCs are stuck in "ContainerCreating" status.
- If we experience a node outage where the primary fails, it can take up to an hour for the PVCs to switch away from Secondary, and any pods which have been rescheduled are stuck in "ContainerCreating" during this time.
According to https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-automatic-promotion, automatic promotion is now baked into DRBD. However, https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-feature-failover-clusters says this should be done with
a cluster resource manager (CRM) such as DRBD Reactor and its promoter plug-in
and, further, https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-linstor_ha says that for HA, auto-promote should be disabled.
I can see drbd-reactor is running in the drbd-prometheus-exporter container on the satellites, but it doesn't seem to be configured with the promoter plug-in.
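For reference, the promoter plug-in is configured through TOML drop-ins which drbd-reactor reads from /etc/drbd-reactor.d/; a minimal sketch (the resource name and systemd unit below are placeholders, not taken from this cluster) would look like:
# write a minimal promoter drop-in (placeholder names)
cat <<'EOF' > /etc/drbd-reactor.d/promoter-example.toml
[[promoter]]
[promoter.resources.pvc-example]
start = ["pvc-example.mount"]
EOF
# restart drbd-reactor so it picks up the drop-in (illustrative only; here it
# runs inside the monitoring container rather than as a host systemd service)
systemctl restart drbd-reactor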
The resources are defined as follows:
╭──────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Key ┊ Value ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ Aux/csi-provisioning-completed-by ┊ linstor-csi/v0.20.0-d514e41db7cdcb580769cc69f1c1ef2b8a5def5d ┊
┊ DrbdOptions/Resource/on-no-quorum ┊ suspend-io ┊
┊ DrbdOptions/Resource/quorum ┊ majority ┊
┊ DrbdOptions/auto-verify-alg ┊ crct10dif-pclmul ┊
┊ DrbdPrimarySetOn ┊ DEDI1-NODE1.23-106-60-155.LON-01.UK ┊
┊ FileSystem/MkfsParams ┊ ┊
┊ FileSystem/Type ┊ xfs ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
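(For reference, that table is the property list of the resource definition; something like the following reproduces it, using one of the PVC names from the listings further down:)
linstor resource-definition list-properties pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483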
and the storage class is:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: team-100-persistent-replicated
provisioner: linstor.csi.linbit.com
parameters:
  storagePool: "team-100"
  resourceGroup: "team-100-persistent-replicated"
  csi.storage.k8s.io/fstype: xfs
  nodeList: "dedi1-node1.23-106-60-155.lon-01.uk vm9-node2.23-106-61-193.lon-01.uk"
  placementPolicy: "Manual"
  property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
  property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
  property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect
Satellites are configured with ZFS pools like so:
satelliteSet:
  enabled: true
  satelliteImage: quay.io/piraeusdatastore/piraeus-server:v1.19.1
  storagePools:
    zfsPools:
      - name: team-100
        zPool: rpool/team-100
        thin: true
  sslSecret: ""
  automaticStorageType: None
  affinity: {}
  tolerations: []
  resources: {}
  monitoringImage: quay.io/piraeusdatastore/drbd-reactor:v0.8.0
  monitoringBindAddress: ""
  kernelModuleInjectionImage: quay.io/piraeusdatastore/drbd9-jammy:v9.1.7
  kernelModuleInjectionMode: Compile
  kernelModuleInjectionAdditionalSourceDirectory: ""
  kernelModuleInjectionResources: {}
  kernelModuleInjectionExtraVolumeMounts: []
  additionalEnv: []
  sidecars: []
  extraVolumes: []
  customLabels: {}
  customAnnotations: {}
So I'm not sure whether auto-promote is enabled by default (the man page says it is), and if so, how do I configure it to promote faster?
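In the meantime, this is how I would check the effective value and force it explicitly (the drbdsetup flag is standard; the LINSTOR property path is an assumption mirroring the DrbdOptions/Resource/* keys shown above):
# show effective DRBD options, defaults included, and look for auto-promote
drbdsetup show --show-defaults pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 | grep auto-promote
# set it explicitly on the resource definition (property path assumed)
linstor resource-definition set-property pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 DrbdOptions/Resource/auto-promote yes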
Here's an example of the DRBD state when adding 6 PVCs at once:
pvc-210c18f6-9a43-4b94-8903-6d3767445044 role:Primary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-23a1dba9-25ab-44c0-a077-9604144d097c role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-81da7546-1393-4931-9080-b90cd058709d role:Primary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-89bf316f-a382-4d2b-9b72-b0151fd78d1b role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-9fe0df48-e921-4e5e-8fed-21d2e76c3423 role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-a568b69c-b5a2-48e0-b64a-c2c82c64baf3 role:Primary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-bcbc88d7-2dfa-462b-9f76-1ece7e945abc role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes

pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801 role:Secondary
  disk:UpToDate
  dedi1-node1.23-106-60-155.lon-01.uk role:Secondary
    peer-disk:UpToDate
  vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary
    peer-disk:Diskless peer-client:yes
Eventually they are promoted. The only logs I can find relating to this are in /var/log/kern.log (I cannot find any related logs in any of the piraeus containers):
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.548756] drbd pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: Preparing cluster-wide state change 2815340248 (0->-1 3/1)
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.549358] drbd pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: State change 2815340248: primary_nodes=1, weak_nodes=FFFFFFFFFFFFFFF8
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.549363] drbd pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: Committing cluster-wide state change 2815340248 (0ms)
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.549372] drbd pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: role( Secondary -> Primary )
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.552361] XFS (drbd1009): Mounting V5 Filesystem
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.575686] XFS (drbd1009): Ending clean mount
Aug 19 14:52:19 dedi1-node1 kernel: [ 2420.576200] xfs filesystem being mounted at /var/lib/kubelet/pods/8424cf60-4d08-428c-91d6-cca08f5b9c2e/volumes/kubernetes.io~csi/pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483/mount supports timestamps until 2038 (0x7fffffff)
It appears I got the causation wrong: something else is causing the delay, and once whatever is making AttachVolume.Attach time out stops, the PVC is promoted to Primary (the docs say a node is promoted to Primary automatically upon a successful mount, which could be the reason). I came to this conclusion because manually forcing the node to be Primary does not speed up AttachVolume.Attach in the pods:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 23m (x2 over 30m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[kube-api-access-flfr4 data]: timed out waiting for the condition
Warning FailedAttachVolume 108s (x37 over 63m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning FailedMount 42s (x26 over 61m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data kube-api-access-flfr4]: timed out waiting for the condition
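(For completeness, "manually forcing the node to be Primary" above was just the plain DRBD commands on the node, e.g.:)
# promote this node for one of the stuck resources
drbdadm primary pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801
# demote again after the test
drbdadm secondary pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801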
The csi-attacher container in the piraeus-op-csi-controller pod is full of the following, repeating until the volume attaches successfully:
W0819 16:04:34.774528 1 csi_handler.go:189] VA csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b for volume pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 has attached status false but actual state true. Adding back to VA queue for forced reprocessing
I0819 16:05:26.028697 1 connection.go:183] GRPC call: /csi.v1.Controller/ControllerPublishVolume
I0819 16:05:26.028697 1 connection.go:184] GRPC request: {"node_id":"vm9-node2.23-106-61-193.lon-01.uk","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":5}},"volume_context":{"linstor.csi.linbit.com/mount-options":"","linstor.csi.linbit.com/post-mount-xfs-opts":"","linstor.csi.linbit.com/remote-access-policy":"true","linstor.csi.linbit.com/uses-volume-context":"true","storage.kubernetes.io/csiProvisionerIdentity":"1660915975988-8081-linstor.csi.linbit.com"},"volume_id":"pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801"}
I0819 16:05:29.736469 1 connection.go:186] GRPC response: {}
I0819 16:05:29.736559 1 connection.go:187] GRPC error: rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: failed to check existing assignment: context deadline exceeded
I0819 16:05:29.736617 1 csi_handler.go:591] Saving attach error to "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.747875 1 csi_handler.go:602] Saved attach error to "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.747914 1 controller.go:167] Ignoring VolumeAttachment "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b" change
I0819 16:05:29.747960 1 csi_handler.go:234] Error processing "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b": failed to attach: rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: failed to check existing assignment: context deadline exceeded
I0819 16:05:29.748042 1 controller.go:210] Started VA processing "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.748066 1 csi_handler.go:224] CSIHandler: processing VA "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.748080 1 csi_handler.go:251] Attaching "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.748098 1 csi_handler.go:421] Starting attach operation for "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.748145 1 csi_handler.go:335] PV finalizer is already set on "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483"
I0819 16:05:29.748187 1 csi_handler.go:740] Found NodeID vm9-node2.23-106-61-193.lon-01.uk in CSINode vm9-node2.23-106-61-193.lon-01.uk
I0819 16:05:29.748207 1 csi_handler.go:304] VA finalizer is already set on "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:05:29.748220 1 csi_handler.go:318] NodeID annotation is already set on "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b"
I0819 16:41:36.919246 1 connection.go:183] GRPC call: /csi.v1.Controller/ListVolumes
I0819 16:41:36.919267 1 connection.go:184] GRPC request: {}
I0819 16:41:36.972002 1 connection.go:186] GRPC response: {"entries":[{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":2147483648,"volume_id":"pvc-210c18f6-9a43-4b94-8903-6d3767445044"}},{"status":{"volume_condition":{"message":"Volume healthy"}},"volume":{"capacity_bytes":11811160064,"volume_id":"pvc-21c4f64e-cc21-46b0-9403-d8c3959c2095"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":10737418240,"volume_id":"pvc-23a1dba9-25ab-44c0-a077-9604144d097c"}},{"status":{"volume_condition":{"message":"Volume healthy"}},"volume":{"capacity_bytes":2147483648,"volume_id":"pvc-27f173d8-af36-4f44-8dc3-59ba86039075"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":209715200,"volume_id":"pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":11811160064,"volume_id":"pvc-81da7546-1393-4931-9080-b90cd058709d"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":209715200,"volume_id":"pvc-89bf316f-a382-4d2b-9b72-b0151fd78d1b"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":2147483648,"volume_id":"pvc-9fe0df48-e921-4e5e-8fed-21d2e76c3423"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":2147483648,"volume_id":"pvc-a568b69c-b5a2-48e0-b64a-c2c82c64baf3"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":11811160064,"volume_id":"pvc-bcbc88d7-2dfa-462b-9f76-1ece7e945abc"}},{"status":{"published_node_ids":["dedi1-node1.23-106-60-155.lon-01.uk","vm6-cplane1.23-106-61-231.lon-01.uk","vm9-node2.23-106-61-193.lon-01.uk"],"volume_condition":{"abnormal":true,"message":"Resource with issues on node(s): 
dedi1-node1.23-106-60-155.lon-01.uk"}},"volume":{"capacity_bytes":22548578304,"volume_id":"pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801"}}]}
I0819 16:41:36.972761 1 connection.go:187] GRPC error: <nil>
The linstor-csi-plugin container in the piraeus-op-csi-controller pod is full of the following, repeating until the volume attaches successfully:
time="2022-08-19T16:20:29Z" level=error msg="method failed" error="rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: failed to check existing assignment: context deadline exceeded" linstorCSIComponent=driver method=/csi.v1.Controller/ControllerPublishVolume nodeID=vm9-node2.23-106-61-193.lon-01.uk provisioner=linstor.csi.linbit.com req="volume_id:\"pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483\" node_id:\"vm9-node2.23-106-61-193.lon-01.uk\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:\"linstor.csi.linbit.com/mount-options\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/post-mount-xfs-opts\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/remote-access-policy\" value:\"true\" > volume_context:<key:\"linstor.csi.linbit.com/uses-volume-context\" value:\"true\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1660915975988-8081-linstor.csi.linbit.com\" > " resp="<nil>" version=v0.20.0-d514e41db7cdcb580769cc69f1c1ef2b8a5def5d
Here's the VolumeAttachment resource:
Name:          csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b
Namespace:
Labels:        <none>
Annotations:   csi.alpha.kubernetes.io/node-id: vm9-node2.23-106-61-193.lon-01.uk
API Version:   storage.k8s.io/v1
Kind:          VolumeAttachment
Metadata:
  Creation Timestamp:  2022-08-19T15:48:29Z
  Finalizers:
    external-attacher/linstor-csi-linbit-com
  Managed Fields:
    API Version:  storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:csi.alpha.kubernetes.io/node-id:
        f:finalizers:
          .:
          v:"external-attacher/linstor-csi-linbit-com":
    Manager:      csi-attacher
    Operation:    Update
    Time:         2022-08-19T15:48:29Z
    API Version:  storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:attacher:
        f:nodeName:
        f:source:
          f:persistentVolumeName:
    Manager:      kube-controller-manager
    Operation:    Update
    Time:         2022-08-19T15:48:29Z
    API Version:  storage.k8s.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:attachError:
          .:
          f:message:
          f:time:
    Manager:      csi-attacher
    Operation:    Update
    Subresource:  status
    Time:         2022-08-19T15:49:29Z
  Resource Version:  67251623
  UID:               461b03ea-fd77-4567-b6b8-b91e8f302fab
Spec:
  Attacher:   linstor.csi.linbit.com
  Node Name:  vm9-node2.23-106-61-193.lon-01.uk
  Source:
    Persistent Volume Name:  pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483
Status:
  Attach Error:
    Message:  rpc error: code = DeadlineExceeded desc = context deadline exceeded
    Time:     2022-08-19T16:11:29Z
  Attached:   false
Events:       <none>
Here's the linstor resource file:
cat /var/lib/linstor.d/pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483.res
# This file was generated by linstor(1.19.1), do not edit manually.
# Local node: vm9-node2.23-106-61-193.lon-01.uk
# Host name : vm9-node2.23-106-61-193.lon-01.uk
resource "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483"
{
options
{
on-no-data-accessible suspend-io;
on-no-quorum suspend-io;
on-suspended-primary-outdated force-secondary;
quorum majority;
}
net
{
cram-hmac-alg sha1;
shared-secret "eJAAPCGDJHtnovtB/VOM";
rr-conflict retry-connect;
verify-alg "crct10dif-pclmul";
}
on vm9-node2.23-106-61-193.lon-01.uk
{
volume 0
{
disk /dev/zvol/rpool/team-100/pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483_00000;
disk
{
discard-zeroes-if-aligned yes;
rs-discard-granularity 8192;
}
meta-disk internal;
device minor 1009;
}
node-id 1;
}
on dedi1-node1.23-106-60-155.lon-01.uk
{
volume 0
{
disk /dev/drbd/this/is/not/used;
disk
{
discard-zeroes-if-aligned yes;
rs-discard-granularity 8192;
}
meta-disk internal;
device minor 1009;
}
node-id 0;
}
on vm6-cplane1.23-106-61-231.lon-01.uk
{
volume 0
{
disk none;
disk
{
discard-zeroes-if-aligned yes;
rs-discard-granularity 8192;
}
meta-disk internal;
device minor 1009;
}
node-id 2;
}
connection
{
host vm9-node2.23-106-61-193.lon-01.uk address ipv4 23.106.61.193:7009;
host dedi1-node1.23-106-60-155.lon-01.uk address ipv4 23.106.60.155:7009;
}
connection
{
host vm9-node2.23-106-61-193.lon-01.uk address ipv4 23.106.61.193:7009;
host vm6-cplane1.23-106-61-231.lon-01.uk address ipv4 23.106.61.231:7009;
}
}
I can successfully mount the volume manually:
# mount /dev/drbd1009 /tmp/test
# ls -la /tmp/test
total 41
drwxr-xr-x 5 root root 149 Aug 19 14:52 .
drwxrwxrwt 11 root root 11 Aug 19 17:59 ..
-rw-r--r-- 1 root root 6902 Aug 19 15:47 ganesha.log
-rw------- 1 root root 36 Aug 19 14:52 nfs-provisioner.identity
drwxrwsrwx 5 systemd-coredump systemd-coredump 44 Aug 19 14:53 pvc-5199abc1-5c27-4940-8a68-52bf1febfd47
drwxr-xr-x 3 root root 19 Aug 19 14:52 v4old
drwxr-xr-x 3 root root 19 Aug 19 14:52 v4recov
-rw------- 1 root root 922 Aug 19 14:52 vfs.conf
linstor node list
╭───────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ SATELLITE ┊ 23.106.60.155:3367 (SSL) ┊ EVICTED ┊
┊ piraeus-op-cs-controller-5dcf464ff-spzdh ┊ CONTROLLER ┊ 10.0.2.141:3367 (SSL) ┊ OFFLINE ┊
┊ piraeus-op-cs-controller-5dcf464ff-vd8lx ┊ CONTROLLER ┊ 10.0.1.24:3367 (SSL) ┊ Online ┊
┊ piraeus-op-cs-controller-5dcf464ff-wpx95 ┊ CONTROLLER ┊ 10.0.4.166:3367 (SSL) ┊ OFFLINE ┊
┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ SATELLITE ┊ 23.106.61.231:3367 (SSL) ┊ Online ┊
┊ vm7-cplane2.23-106-61-232.lon-01.uk ┊ SATELLITE ┊ 23.106.61.232:3367 (SSL) ┊ Online ┊
┊ vm9-node2.23-106-61-193.lon-01.uk ┊ SATELLITE ┊ 23.106.61.193:3367 (SSL) ┊ Online ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────╯
Some commands are taking a long time to complete:
time linstor resource list
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7009 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:32:30 ┊
┊ pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7009 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:31:59 ┊
┊ pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7009 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:33:13 ┊
┊ pvc-9fe0df48-e921-4e5e-8fed-21d2e76c3423 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7010 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:33:50 ┊
┊ pvc-9fe0df48-e921-4e5e-8fed-21d2e76c3423 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7010 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:36:49 ┊
┊ pvc-9fe0df48-e921-4e5e-8fed-21d2e76c3423 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7010 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:37:05 ┊
┊ pvc-23a1dba9-25ab-44c0-a077-9604144d097c ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7006 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:28:58 ┊
┊ pvc-23a1dba9-25ab-44c0-a077-9604144d097c ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7006 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:29:44 ┊
┊ pvc-23a1dba9-25ab-44c0-a077-9604144d097c ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7006 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:30:29 ┊
┊ pvc-81da7546-1393-4931-9080-b90cd058709d ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7000 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 13:53:57 ┊
┊ pvc-81da7546-1393-4931-9080-b90cd058709d ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7000 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 13:54:07 ┊
┊ pvc-81da7546-1393-4931-9080-b90cd058709d ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7000 ┊ InUse ┊ Ok ┊ UpToDate ┊ 2022-08-19 13:54:23 ┊
┊ pvc-89bf316f-a382-4d2b-9b72-b0151fd78d1b ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7008 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:32:30 ┊
┊ pvc-89bf316f-a382-4d2b-9b72-b0151fd78d1b ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7008 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:31:51 ┊
┊ pvc-89bf316f-a382-4d2b-9b72-b0151fd78d1b ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7008 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:32:35 ┊
┊ pvc-210c18f6-9a43-4b94-8903-6d3767445044 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7003 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 13:54:18 ┊
┊ pvc-210c18f6-9a43-4b94-8903-6d3767445044 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7003 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 13:54:34 ┊
┊ pvc-210c18f6-9a43-4b94-8903-6d3767445044 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7003 ┊ InUse ┊ Ok ┊ UpToDate ┊ 2022-08-19 13:55:03 ┊
┊ pvc-a568b69c-b5a2-48e0-b64a-c2c82c64baf3 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7004 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 13:54:18 ┊
┊ pvc-a568b69c-b5a2-48e0-b64a-c2c82c64baf3 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7004 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 13:54:41 ┊
┊ pvc-a568b69c-b5a2-48e0-b64a-c2c82c64baf3 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7004 ┊ InUse ┊ Ok ┊ UpToDate ┊ 2022-08-19 13:55:36 ┊
┊ pvc-bcbc88d7-2dfa-462b-9f76-1ece7e945abc ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7005 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:28:10 ┊
┊ pvc-bcbc88d7-2dfa-462b-9f76-1ece7e945abc ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7005 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:28:22 ┊
┊ pvc-bcbc88d7-2dfa-462b-9f76-1ece7e945abc ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7005 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:28:39 ┊
┊ pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7007 ┊ ┊ Ok ┊ INACTIVE ┊ 2022-08-19 14:30:15 ┊
┊ pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7007 ┊ Unused ┊ Ok ┊ TieBreaker ┊ 2022-08-19 14:31:09 ┊
┊ pvc-ccc2551c-f7d6-44ae-a148-7b567c5a5801 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7007 ┊ Unused ┊ Ok ┊ UpToDate ┊ 2022-08-19 14:31:38 ┊
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
________________________________________________________
Executed in   87.24 secs    fish           external
   usr time  254.74 millis  783.00 micros  253.96 millis
   sys time  112.38 millis  421.00 micros  111.96 millis
Further info:
kubectl version | grep Server
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:18:48Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"}
grep Image: /root/k8s-config/manifests/piraeus.yaml
storkImage: docker.io/openstorage/stork:2.8.2
schedulerImage: k8s.gcr.io/kube-scheduler-amd64
pluginImage: quay.io/piraeusdatastore/piraeus-csi:v0.20.0
csiAttacherImage: k8s.gcr.io/sig-storage/csi-attacher:v3.5.0
csiLivenessProbeImage: k8s.gcr.io/sig-storage/livenessprobe:v2.7.0
csiNodeDriverRegistrarImage: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.1
csiProvisionerImage: k8s.gcr.io/sig-storage/csi-provisioner:v3.2.1
csiSnapshotterImage: k8s.gcr.io/sig-storage/csi-snapshotter:v6.0.1
csiResizerImage: k8s.gcr.io/sig-storage/csi-resizer:v1.5.0
controllerImage: quay.io/piraeusdatastore/piraeus-server:v1.19.1
satelliteImage: quay.io/piraeusdatastore/piraeus-server:v1.19.1
monitoringImage: quay.io/piraeusdatastore/drbd-reactor:v0.8.0
kernelModuleInjectionImage: quay.io/piraeusdatastore/drbd9-jammy:v9.1.7
linstor err l contains no errors.
Ubuntu 22.04 LTS \n \l
Linux vm9-node2.23-106-61-193.lon-01.uk 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
drbd enabled through injector
[ 629.222020] drbd: module verification failed: signature and/or required key missing - tainting kernel
[ 629.230562] drbd: initialized. Version: 9.1.7 (api:2/proto:110-121)
[ 629.230564] drbd: GIT-hash: bfd2450739e3e27cfd0a2eece2cde3d94ad993ae build by @vm9-node2.23-106-61-193.lon-01.uk.appsolo.com, 2022-08-18 20:28:16
[ 629.230564] drbd: registered as block device major 147
Also attached: the log from kubectl -n default logs --since=24h pod/piraeus-op-csi-controller-5f6cd85d8b-244hf --all-containers (linstor.log).
Attached: the output from linstor -m resource-definition list (res.json.log).
On the surviving storage node, a zfs list process seems to keep spawning every second or so using 100% CPU:
1 520310 520310 5973 ? -1 Sl 0 0:27 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id eeea8b4990656ec8e8576ad3d320b573d703b53723e4511eec2cd4d9c84b25a5 -address /run/containerd/containerd.sock
520310 520341 520341 520341 ? -1 Ss 65535 0:00 \_ /pause
520310 521614 521614 521614 ? -1 Ssl 0 0:01 \_ /usr/bin/k8s-await-election /usr/bin/piraeus-entry.sh startSatellite
521614 521746 521614 521614 ? -1 S 0 0:00 | \_ /bin/sh /usr/bin/piraeus-entry.sh startSatellite
521746 522206 521614 521614 ? -1 Sl 0 3:04 | \_ java -Xms32M -XX:MinHeapFreeRatio=40 -XX:MaxHeapFreeRatio=70 --illegal-access=permit -classpath /usr/share/linstor-server/lib/conf:/usr/share/linstor-server/lib/* com.linbit.linstor.core.Satellite --logs=/var/log/linstor-satellite --config
522206 523278 521614 521614 ? -1 S 0 0:00 | \_ drbdsetup events2 all
522206 2194352 521614 521614 ? -1 R 0 0:05 | \_ zfs list -H -p -o name,used,refer,volsize,type -t volume,snapshot
520310 522064 522064 522064 ? -1 Ssl 0 0:01 \_ /usr/sbin/drbd-reactor
522064 522081 522064 522064 ? -1 S 0 0:00 \_ drbdsetup events2 --full --poll
522064 770682 770678 770271 ? -1 Z 0 0:00 \_ [drbdsetup] <defunct>
522064 770786 770782 770271 ? -1 Z 0 0:00 \_ [drbdsetup] <defunct>
With the above example the volume attach appears to have failed indefinitely as it's been over 3 hours. Sometimes the volumes will attach after roughly 45 minutes.
Here are some further logs from kube-controller-manager with --v=4:
I0822 13:04:24.426940 1 reconciler.go:325] "attacherDetacher.AttachVolume started" volume={VolumeToAttach:{MultiAttachErrorReported:false VolumeName:kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 VolumeSpec:0xc0052f7650 NodeName:vm9-node2.23-106-61-193.lon-01.uk ScheduledPods:[&Pod{ObjectMeta:{nfs-taiga-staticdata-nfs-server-provisioner-0 nfs-taiga-staticdata-nfs-server-provisioner- team-100 bcccf001-6205-451e-89f8-bead2761be28 67233800 0 2022-08-19 15:48:51 +0000 UTC <nil> <nil> map[app:nfs-server-provisioner chart:nfs-server-provisioner-1.4.0 controller-revision-hash:nfs-taiga-staticdata-nfs-server-provisioner-5c75756788 heritage:Helm release:nfs-taiga-staticdata statefulset.kubernetes.io/pod-name:nfs-taiga-staticdata-nfs-server-provisioner-0] map[] [{apps/v1 StatefulSet nfs-taiga-staticdata-nfs-server-provisioner 554741f9-b120-46f2-8905-ebdd2bdb5607 0xc004267447 0xc004267448}] [] [{kube-controller-manager Update v1 2022-08-19 15:48:51 +0000 UTC FieldsV1 {"f:metadata":{"f:generateName":{},"f:labels":{".":{},"f:app":{},"f:chart":{},"f:controller-revision-hash":{},"f:heritage":{},"f:release":{},"f:statefulset.kubernetes.io/pod-name":{}},"f:ownerReferences":{".":{},"k:{\"uid\":\"554741f9-b120-46f2-8905-ebdd2bdb5607\"}":{}}},"f:spec":{"f:affinity":{".":{},"f:nodeAffinity":{".":{},"f:requiredDuringSchedulingIgnoredDuringExecution":{}},"f:podAntiAffinity":{".":{},"f:requiredDuringSchedulingIgnoredDuringExecution":{}}},"f:containers":{"k:{\"name\":\"nfs-server-provisioner\"}":{".":{},"f:args":{},"f:env":{".":{},"k:{\"name\":\"POD_IP\"}":{".":{},"f:name":{},"f:valueFrom":{".":{},"f:fieldRef":{}}},"k:{\"name\":\"POD_NAMESPACE\"}":{".":{},"f:name":{},"f:valueFrom":{".":{},"f:fieldRef":{}}},"k:{\"name\":\"SERVICE_NAME\"}":{".":{},"f:name":{},"f:value":{}}},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:ports":{".":{},"k:{\"containerPort\":111,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":111,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":662,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":662,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":875,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":875,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":2049,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":2049,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":20048,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":20048,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":32803,\"protocol\":\"TCP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}},"k:{\"containerPort\":32803,\"protocol\":\"UDP\"}":{".":{},"f:containerPort":{},"f:name":{},"f:protocol":{}}},"f:resources":{},"f:securityContext":{".":{},"f:capabilities":{".":{},"f:add":{}}},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{},"f:volumeMounts":{".":{},"k:{\"mountPath\":\"/export\"}":{".":{},"f:mountPath":{},"f:name":{}}}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:hostname":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:serviceAcco
unt":{},"f:serviceAccountName":{},"f:subdomain":{},"f:terminationGracePeriodSeconds":{},"f:volumes":{".":{},"k:{\"name\":\"data\"}":{".":{},"f:name":{},"f:persistentVolumeClaim":{".":{},"f:claimName":{}}}}}} } {kubelet Update v1 2022-08-19 15:49:13 +0000 UTC FieldsV1 {"f:status":{"f:conditions":{"k:{\"type\":\"ContainersReady\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Initialized\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:status":{},"f:type":{}},"k:{\"type\":\"Ready\"}":{".":{},"f:lastProbeTime":{},"f:lastTransitionTime":{},"f:message":{},"f:reason":{},"f:status":{},"f:type":{}}},"f:containerStatuses":{},"f:hostIP":{},"f:startTime":{}}} status}]},Spec:PodSpec{Volumes:[]Volume{Volume{Name:data,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:nil,NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:&PersistentVolumeClaimVolumeSource{ClaimName:data-nfs-taiga-staticdata-nfs-server-provisioner-0,ReadOnly:false,},RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:nil,StorageOS:nil,CSI:nil,Ephemeral:nil,},},Volume{Name:kube-api-access-d79lc,VolumeSource:VolumeSource{HostPath:nil,EmptyDir:nil,GCEPersistentDisk:nil,AWSElasticBlockStore:nil,GitRepo:nil,Secret:nil,NFS:nil,ISCSI:nil,Glusterfs:nil,PersistentVolumeClaim:nil,RBD:nil,FlexVolume:nil,Cinder:nil,CephFS:nil,Flocker:nil,DownwardAPI:nil,FC:nil,AzureFile:nil,ConfigMap:nil,VsphereVolume:nil,Quobyte:nil,AzureDisk:nil,PhotonPersistentDisk:nil,PortworxVolume:nil,ScaleIO:nil,Projected:&ProjectedVolumeSource{Sources:[]VolumeProjection{VolumeProjection{Secret:nil,DownwardAPI:nil,ConfigMap:nil,ServiceAccountToken:&ServiceAccountTokenProjection{Audience:,ExpirationSeconds:*3607,Path:token,},},VolumeProjection{Secret:nil,DownwardAPI:nil,ConfigMap:&ConfigMapProjection{LocalObjectReference:LocalObjectReference{Name:kube-root-ca.crt,},Items:[]KeyToPath{KeyToPath{Key:ca.crt,Path:ca.crt,Mode:nil,},},Optional:nil,},ServiceAccountToken:nil,},VolumeProjection{Secret:nil,DownwardAPI:&DownwardAPIProjection{Items:[]DownwardAPIVolumeFile{DownwardAPIVolumeFile{Path:namespace,FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.namespace,},ResourceFieldRef:nil,Mode:nil,},},},ConfigMap:nil,ServiceAccountToken:nil,},},DefaultMode:*420,},StorageOS:nil,CSI:nil,Ephemeral:nil,},},},Containers:[]Container{Container{Name:nfs-server-provisioner,Image:k8s.gcr.io/sig-storage/nfs-provisioner:v3.0.0,Command:[],Args:[-provisioner=cluster.local/nfs-taiga-staticdata-nfs-server-provisioner 
-device-based-fsids=false],WorkingDir:,Ports:[]ContainerPort{ContainerPort{Name:nfs,HostPort:0,ContainerPort:2049,Protocol:TCP,HostIP:,},ContainerPort{Name:nfs-udp,HostPort:0,ContainerPort:2049,Protocol:UDP,HostIP:,},ContainerPort{Name:nlockmgr,HostPort:0,ContainerPort:32803,Protocol:TCP,HostIP:,},ContainerPort{Name:nlockmgr-udp,HostPort:0,ContainerPort:32803,Protocol:UDP,HostIP:,},ContainerPort{Name:mountd,HostPort:0,ContainerPort:20048,Protocol:TCP,HostIP:,},ContainerPort{Name:mountd-udp,HostPort:0,ContainerPort:20048,Protocol:UDP,HostIP:,},ContainerPort{Name:rquotad,HostPort:0,ContainerPort:875,Protocol:TCP,HostIP:,},ContainerPort{Name:rquotad-udp,HostPort:0,ContainerPort:875,Protocol:UDP,HostIP:,},ContainerPort{Name:rpcbind,HostPort:0,ContainerPort:111,Protocol:TCP,HostIP:,},ContainerPort{Name:rpcbind-udp,HostPort:0,ContainerPort:111,Protocol:UDP,HostIP:,},ContainerPort{Name:statd,HostPort:0,ContainerPort:662,Protocol:TCP,HostIP:,},ContainerPort{Name:statd-udp,HostPort:0,ContainerPort:662,Protocol:UDP,HostIP:,},},Env:[]EnvVar{EnvVar{Name:POD_IP,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:status.podIP,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:SERVICE_NAME,Value:nfs-taiga-staticdata-nfs-server-provisioner,ValueFrom:nil,},EnvVar{Name:POD_NAMESPACE,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.namespace,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:data,ReadOnly:false,MountPath:/export,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:kube-api-access-d79lc,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:&SecurityContext{Capabilities:&Capabilities{Add:[DAC_READ_SEARCH SYS_RESOURCE],Drop:[],},Privileged:nil,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,ProcMount:nil,WindowsOptions:nil,SeccompProfile:nil,},Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,},},RestartPolicy:Always,TerminationGracePeriodSeconds:*100,ActiveDeadlineSeconds:nil,DNSPolicy:ClusterFirst,NodeSelector:map[string]string{},ServiceAccountName:nfs-taiga-staticdata-nfs-server-provisioner,DeprecatedServiceAccount:nfs-taiga-staticdata-nfs-server-provisioner,NodeName:vm9-node2.23-106-61-193.lon-01.uk,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:&PodSecurityContext{SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,SupplementalGroups:[],FSGroup:nil,RunAsGroup:nil,Sysctls:[]Sysctl{},WindowsOptions:nil,FSGroupChangePolicy:nil,SeccompProfile:nil,},ImagePullSecrets:[]LocalObjectReference{},Hostname:nfs-taiga-staticdata-nfs-server-provisioner-0,Subdomain:nfs-taiga-staticdata-nfs-server-provisioner,Affinity:&Affinity{NodeAffinity:&NodeAffinity{RequiredDuringSchedulingIgnoredDuringExecution:&NodeSelector{NodeSelectorTerms:[]NodeSelectorTerm{NodeSelectorTerm{MatchExpressions:[]NodeSelectorRequirement{NodeSelectorRequirement{Key:kubernetes.io/hostname,Operator:In,Values:[dedi1-node1.23-106-60-155.lon-01.uk 
vm9-node2.23-106-61-193.lon-01.uk],},},MatchFields:[]NodeSelectorRequirement{},},},},PreferredDuringSchedulingIgnoredDuringExecution:[]PreferredSchedulingTerm{},},PodAffinity:nil,PodAntiAffinity:&PodAntiAffinity{RequiredDuringSchedulingIgnoredDuringExecution:[]PodAffinityTerm{PodAffinityTerm{LabelSelector:&v1.LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[]LabelSelectorRequirement{LabelSelectorRequirement{Key:release,Operator:In,Values:[taiga-staticdata],},},},Namespaces:[],TopologyKey:kubernetes.io/hostname,NamespaceSelector:nil,},},PreferredDuringSchedulingIgnoredDuringExecution:[]WeightedPodAffinityTerm{},},},SchedulerName:default-scheduler,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{Toleration{Key:node.kubernetes.io/not-ready,Operator:Exists,Value:,Effect:NoExecute,TolerationSeconds:*20,},Toleration{Key:node.kubernetes.io/unreachable,Operator:Exists,Value:,Effect:NoExecute,TolerationSeconds:*20,},},HostAliases:[]HostAlias{},PriorityClassName:,Priority:*0,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:*true,PreemptionPolicy:*PreemptLowerPriority,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,},Status:PodStatus{Phase:Pending,Conditions:[]PodCondition{PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2022-08-19 15:48:51 +0000 UTC,Reason:,Message:,},PodCondition{Type:Ready,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2022-08-19 15:48:51 +0000 UTC,Reason:ContainersNotReady,Message:containers with unready status: [nfs-server-provisioner],},PodCondition{Type:ContainersReady,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2022-08-19 15:48:51 +0000 UTC,Reason:ContainersNotReady,Message:containers with unready status: [nfs-server-provisioner],},PodCondition{Type:PodScheduled,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2022-08-19 15:48:51 +0000 UTC,Reason:,Message:,},},Message:,Reason:,HostIP:23.106.61.193,PodIP:,StartTime:2022-08-19 15:48:51 +0000 UTC,ContainerStatuses:[]ContainerStatus{ContainerStatus{Name:nfs-server-provisioner,State:ContainerState{Waiting:&ContainerStateWaiting{Reason:ContainerCreating,Message:,},Running:nil,Terminated:nil,},LastTerminationState:ContainerState{Waiting:nil,Running:nil,Terminated:nil,},Ready:false,RestartCount:0,Image:k8s.gcr.io/sig-storage/nfs-provisioner:v3.0.0,ImageID:,ContainerID:,Started:*false,},},QOSClass:BestEffort,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},}]}}
E0822 13:04:24.958001 1 csi_attacher.go:722] kubernetes.io/csi: attachment for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0822 13:04:24.958051 1 actual_state_of_world.go:357] Volume "kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" is already added to attachedVolume list to node "vm9-node2.23-106-61-193.lon-01.uk", update device path ""
E0822 13:04:24.958155 1 nestedpendingoperations.go:335] Operation for "{volumeName:kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 podName: nodeName:vm9-node2.23-106-61-193.lon-01.uk}" failed. No retries permitted until 2022-08-22 13:06:26.958119222 +0000 UTC m=+575.556584361 (durationBeforeRetry 2m2s). Error: AttachVolume.Attach failed for volume "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" (UniqueName: "kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483") from node "vm9-node2.23-106-61-193.lon-01.uk" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
I0822 13:04:24.958190 1 event.go:294] "Event occurred" object="team-100/nfs-taiga-staticdata-nfs-server-provisioner-0" fieldPath="" kind="Pod" apiVersion="v1" type="Warning" reason="FailedAttachVolume" message="AttachVolume.Attach failed for volume \"pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483\" : rpc error: code = DeadlineExceeded desc = context deadline exceeded"
I0822 13:04:28.511467 1 actual_state_of_world.go:400] SetVolumeMountedByNode volume kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 to the node "vm9-node2.23-106-61-193.lon-01.uk" mounted true
I0822 13:04:39.188610 1 pv_controller.go:254] synchronizing PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0]: phase: Bound, bound to: "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483", bindCompleted: true, boundByController: true
I0822 13:04:39.188657 1 pv_controller.go:502] synchronizing bound PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0]: volume "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" found: phase: Bound, bound to: "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0 (uid: 7e1063f2-b831-41a6-8c85-fee1fbf6c483)", boundByController: false
I0822 13:04:39.188669 1 pv_controller.go:519] synchronizing bound PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0]: claim is already correctly bound
I0822 13:04:39.188678 1 pv_controller.go:1023] binding volume "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" to claim "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0"
I0822 13:04:39.188689 1 pv_controller.go:921] updating PersistentVolume[pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483]: binding to "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0"
I0822 13:04:39.188723 1 pv_controller.go:933] updating PersistentVolume[pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483]: already bound to "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0"
I0822 13:04:39.188738 1 pv_controller.go:869] updating PersistentVolume[pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483]: set phase Bound
I0822 13:04:39.188753 1 pv_controller.go:872] updating PersistentVolume[pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483]: phase Bound already set
I0822 13:04:39.188773 1 pv_controller.go:961] updating PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0]: binding to "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483"
I0822 13:04:39.188802 1 pv_controller.go:1008] updating PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0]: already bound to "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483"
I0822 13:04:39.188834 1 pv_controller.go:762] updating PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0] status: set phase Bound
I0822 13:04:39.188854 1 pv_controller.go:820] updating PersistentVolumeClaim[team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0] status: phase Bound already set
I0822 13:04:39.188874 1 pv_controller.go:1049] volume "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" bound to claim "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0"
I0822 13:04:39.188908 1 pv_controller.go:1050] volume "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" status after binding: phase: Bound, bound to: "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0 (uid: 7e1063f2-b831-41a6-8c85-fee1fbf6c483)", boundByController: false
I0822 13:04:39.188944 1 pv_controller.go:1051] claim "team-100/data-nfs-taiga-staticdata-nfs-server-provisioner-0" status after binding: phase: Bound, bound to: "pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483", bindCompleted: true, boundByController: true
The only thing which really stands out is:
I0822 13:04:24.958051 1 actual_state_of_world.go:357] Volume "kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483" is already added to attachedVolume list to node "vm9-node2.23-106-61-193.lon-01.uk", update device path ""
I'm not sure if this is causing, or caused by, the timeout issue. The parent function for this is https://github.com/kubernetes/kubernetes/blob/95ee5ab382d64cfe6c28967f36b53970b8374491/pkg/kubelet/volumemanager/cache/actual_state_of_world.go#L338
I'm unsure why the device path would be empty.
I was looking for similar issues with volume attachment in CSI plugins; https://github.com/longhorn/longhorn/issues/2322 sounds similar, but these volumes contain hardly any files, so it cannot be a delay due to ownership changes.
I will continue trying to dig into this.
Hi @Rid, just a quick update: thanks for the detailed logs. With most devs on well-deserved vacation it will take a bit until somebody looks into this, but they will get to it eventually.
> dedi1-node1.23-106-60-155.lon-01.uk ┊ SATELLITE ┊ 23.106.60.155:3367 (SSL) ┊ EVICTED

I wonder if this has something to do with auto-eviction?
> I0819 16:05:29.747960 1 csi_handler.go:234] Error processing "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b": failed to attach: rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: failed to check existing assignment: context deadline exceeded

That looks like the most promising lead to me. The ControllerPublishVolume call has a timeout of 1 minute AFAIK, so I am wondering why it exceeds that timeout. I think it only really checks that the volume is available on whatever node was requested, so there shouldn't really be any potential for a timeout. Can you run with the chart value csi.logLevel=debug set and check the logs of the csi-controller pod, specifically the linstor-csi-plugin container, for more information?
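For anyone following along, with the operator Helm chart that would be roughly the following (release and chart names are assumptions based on the piraeus-op-* pod names above):
helm upgrade piraeus-op <piraeus-operator-chart> --reuse-values --set csi.logLevel=debug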
> dedi1-node1.23-106-60-155.lon-01.uk ┊ SATELLITE ┊ 23.106.60.155:3367 (SSL) ┊ EVICTED
> I wonder if this has something to do with auto-eviction?

The node dedi1-node1.23-106-60-155.lon-01.uk is properly evicted, as we're testing fail-over after a node outage. We've left the node offline to try and debug the issue (where PVs are timing out during volume attachment).
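(Side note: per the LINSTOR docs, an evicted satellite can be re-joined once it is back online:)
# undo the eviction after the node returns
linstor node restore dedi1-node1.23-106-60-155.lon-01.uk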
> I0819 16:05:29.747960 1 csi_handler.go:234] Error processing "csi-d9e810a254b0ffa73804d0480a395b9e575625dd0482256f9213db53b5fc908b": failed to attach: rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483: failed to check existing assignment: context deadline exceeded

> That looks like the most promising lead to me. The ControllerPublishVolume call has a timeout of 1 minute AFAIK, so I am wondering why it exceeds that timeout. I think it only really checks that the volume is available on whatever node was requested, so there shouldn't really be any potential for a timeout. Can you run with the chart value csi.logLevel=debug set and check the logs of the csi-controller pod, specifically the linstor-csi-plugin container, for more information?

I'll increase the log value today. Can you possibly link me to the controller code which does the check on the LINSTOR side?
> can you possibly link me to the controller code which does the check on the linstor side?
That's the entry point on the CSI plugin side: https://github.com/piraeusdatastore/linstor-csi/blob/master/pkg/driver/driver.go#L613
And that's where the attach operation actually happens: https://github.com/piraeusdatastore/linstor-csi/blob/master/pkg/client/linstor.go#L430
I've recreated the PVs with the debug logs enabled; listing the resources is taking a long time:
# time linstor r l -a
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ ResourceName ┊ Node ┊ Port ┊ Usage ┊ Conns ┊ State ┊ CreatedOn ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ pvc-3eb665ec-27ec-4875-8df9-106751f1a832 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7008 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:39:18 ┊
┊ pvc-3eb665ec-27ec-4875-8df9-106751f1a832 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7008 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:39:41 ┊
┊ pvc-3eb665ec-27ec-4875-8df9-106751f1a832 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7008 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:40:27 ┊
┊ pvc-3ec4cd7c-a369-4fe8-b0c3-68528ad1ef37 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7007 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:39:18 ┊
┊ pvc-3ec4cd7c-a369-4fe8-b0c3-68528ad1ef37 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7007 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:39:39 ┊
┊ pvc-3ec4cd7c-a369-4fe8-b0c3-68528ad1ef37 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7007 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:39:56 ┊
┊ pvc-95f607ac-89fb-43d3-81a1-2ad89bdfa438 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7004 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:38:34 ┊
┊ pvc-95f607ac-89fb-43d3-81a1-2ad89bdfa438 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7004 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:38:53 ┊
┊ pvc-95f607ac-89fb-43d3-81a1-2ad89bdfa438 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7004 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:39:30 ┊
┊ pvc-8932ceaf-ebc6-4c7f-855f-b3c8da2a9564 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7009 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:39:19 ┊
┊ pvc-8932ceaf-ebc6-4c7f-855f-b3c8da2a9564 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7009 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:39:44 ┊
┊ pvc-8932ceaf-ebc6-4c7f-855f-b3c8da2a9564 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7009 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:40:27 ┊
┊ pvc-28069ce0-b3a0-4f46-94db-ae5f2eda5d4d ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7006 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:38:34 ┊
┊ pvc-28069ce0-b3a0-4f46-94db-ae5f2eda5d4d ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7006 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:38:59 ┊
┊ pvc-28069ce0-b3a0-4f46-94db-ae5f2eda5d4d ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7006 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:39:30 ┊
┊ pvc-d2b15e29-2594-40b3-90ae-9a14c0fda0bc ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7003 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:38:34 ┊
┊ pvc-d2b15e29-2594-40b3-90ae-9a14c0fda0bc ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7003 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:38:50 ┊
┊ pvc-d2b15e29-2594-40b3-90ae-9a14c0fda0bc ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7003 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:39:08 ┊
┊ pvc-fbce07ea-aa30-4093-acba-82661d040903 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7010 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:39:19 ┊
┊ pvc-fbce07ea-aa30-4093-acba-82661d040903 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7010 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:39:46 ┊
┊ pvc-fbce07ea-aa30-4093-acba-82661d040903 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7010 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:40:27 ┊
┊ pvc-fc12a115-ace3-49d2-8fb1-a3cb8b1a3b38 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7005 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:38:34 ┊
┊ pvc-fc12a115-ace3-49d2-8fb1-a3cb8b1a3b38 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7005 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:38:56 ┊
┊ pvc-fc12a115-ace3-49d2-8fb1-a3cb8b1a3b38 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7005 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:39:30 ┊
┊ pvc-ff369c1b-3abd-4e90-9481-7aeb376626e0 ┊ dedi1-node1.23-106-60-155.lon-01.uk ┊ 7000 ┊ ┊ ┊ Unknown ┊ 2022-08-31 15:37:33 ┊
┊ pvc-ff369c1b-3abd-4e90-9481-7aeb376626e0 ┊ vm6-cplane1.23-106-61-231.lon-01.uk ┊ 7000 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ TieBreaker ┊ 2022-08-31 15:37:50 ┊
┊ pvc-ff369c1b-3abd-4e90-9481-7aeb376626e0 ┊ vm9-node2.23-106-61-193.lon-01.uk ┊ 7000 ┊ Unused ┊ Connecting(dedi1-node1.23-106-60-155.lon-01.uk) ┊ UpToDate ┊ 2022-08-31 15:38:00 ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
________________________________________________________
Executed in   55.88 secs    fish           external
   usr time  276.36 millis    0.00 millis  276.36 millis
   sys time   61.65 millis    1.15 millis   60.51 millis
After the PVs were attached, this went down to about 9 seconds.
All pods were stuck for 15 minutes and were then created. We did a crictl prune and deleted a bunch of exited containers, which reduced the ZFS datasets from over 1300 down to 658 and shaved about a second off a zfs list.
The system we're creating will have thousands of datasets, though (each system will have thousands of containers, and we're running ZFS on root).
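For reference, the enumeration cost the satellite incurs can be timed directly; this is the exact query visible in the process list above:
# time the dataset/snapshot enumeration the satellite runs, and count entries
time zfs list -H -p -o name,used,refer,volsize,type -t volume,snapshot | wc -l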
I've attached the logs of the linstor-csi-plugin from the time the PVs were created: debug-logs.txt
Analysing the logs shows that the initial ControllerPublishVolume failed due to the connection being refused when the node went offline; after a few minutes we can see it switch to:
time="2022-08-31T16:04:04Z" level=error msg="method failed" func=github.com/piraeusdatastore/linstor-csi/pkg/driver.Driver.Run.func1 file="/src/pkg/driver/driver.go:1220" error="rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-ff369c1b-3abd-4e90-9481-7aeb376626e0: context deadline exceeded" linstorCSIComponent=driver method=/csi.v1.Controller/ControllerPublishVolume nodeID=vm7-cplane2.23-106-61-232.lon-01.uk.appsolo.com provisioner=linstor.csi.linbit.com req="volume_id:\"pvc-ff369c1b-3abd-4e90-9481-7aeb376626e0\" node_id:\"vm9-node2.23-106-61-193.lon-01.uk.appsolo.com\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:\"linstor.csi.linbit.com/mount-options\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/post-mount-xfs-opts\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/remote-access-policy\" value:\"true\" > volume_context:<key:\"linstor.csi.linbit.com/uses-volume-context\" value:\"true\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1661955201588-8081-linstor.csi.linbit.com\" > " resp="<nil>" version=v0.20.0-d514e41db7cdcb580769cc69f1c1ef2b8a5def5d
Unfortunately there doesn't appear to be any information about why this is timing out. However we can deduce that the timeout is happening here:
https://github.com/piraeusdatastore/linstor-csi/blob/4ee8dce775083abefa1d1b021ae864109cbbd339/pkg/driver/driver.go#L653
I believe the timeout may be occurring on this request:
https://github.com/piraeusdatastore/linstor-csi/blob/4ee8dce775083abefa1d1b021ae864109cbbd339/pkg/client/linstor.go#L436
As it takes 5 seconds to get a response:
# time curl -v -k -H 'Accept: application/json' 'https://piraeus-op-cs.default.svc:3371/v1/view/resources?limit=0&offset=0&resources=pvc-3eb665ec-27ec-4875-8df9-106751f1a832'
* Trying 10.74.207.153:3371...
* Connected to piraeus-op-cs.default.svc (10.74.207.153) port 3371 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
* CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: CN=piraeus-op-cs
* start date: Aug 31 12:46:10 2022 GMT
* expire date: Aug 28 12:46:10 2032 GMT
* issuer: CN=piraeus-op-ca
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
> GET /v1/view/resources?limit=0&offset=0&resources=pvc-3eb665ec-27ec-4875-8df9-106751f1a832 HTTP/1.1
> Host: piraeus-op-cs.default.svc:3371
> User-Agent: curl/7.74.0
> Accept: application/json
>
* Empty reply from server
* Connection #0 to host piraeus-op-cs.default.svc left intact
curl: (52) Empty reply from server
real 0m5.054s
user 0m0.020s
sys 0m0.020s
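To make the failure mode concrete, here is a minimal Go sketch (not the actual linstor-csi code; the 1 second timeout and the in-cluster URL are just illustrative) of how a slow LINSTOR API response surfaces as a "context deadline exceeded" error:

package main

import (
    "context"
    "errors"
    "fmt"
    "net/http"
    "time"
)

func main() {
    // Illustrative timeout; in linstor-csi the deadline actually comes
    // from the CSI sidecar making the gRPC call.
    ctx, cancel := context.WithTimeout(context.Background(), 1*time.Second)
    defer cancel()

    req, err := http.NewRequestWithContext(ctx, http.MethodGet,
        "https://piraeus-op-cs.default.svc:3371/v1/view/resources", nil)
    if err != nil {
        panic(err)
    }

    // If the server needs ~5 seconds to answer, the request is aborted
    // as soon as the context deadline passes.
    _, err = http.DefaultClient.Do(req)
    if errors.Is(err, context.DeadlineExceeded) {
        fmt.Println("request aborted: context deadline exceeded")
    }
}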
Strangely, getting the resources and resource definitions via curl gives an empty reply. Is this normal?
# curl -k -X GET https://piraeus-op-cs.default.svc:3371/v1/view/resources
curl: (52) Empty reply from server
Here the resources view times out after 23 secs:
time="2022-08-31T16:05:07Z" level=debug msg="curl -X 'GET' -H 'Accept: application/json' 'https://piraeus-op-cs.default.svc:3371/v1/view/resources?limit=0&offset=0&resources=pvc-3eb665ec-27ec-4875-8df9-106751f1a832'"
time="2022-08-31T16:05:30Z" level=error msg="method failed" func=github.com/piraeusdatastore/linstor-csi/pkg/driver.Driver.Run.func1 file="/src/pkg/driver/driver.go:1220" error="rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-3eb665ec-27ec-4875-8df9-106751f1a832: context canceled" linstorCSIComponent=driver method=/csi.v1.Controller/ControllerPublishVolume nodeID=vm7-cplane2.23-106-61-232.lon-01.uk.appsolo.com provisioner=linstor.csi.linbit.com
Here on the same resources view it times out after just 1 second:
time="2022-08-31T16:06:29Z" level=debug msg="curl -X 'GET' -H 'Accept: application/json' 'https://piraeus-op-cs.default.svc:3371/v1/view/resources?limit=0&offset=0&resources=pvc-3eb665ec-27ec-4875-8df9-106751f1a832'"
time="2022-08-31T16:06:30Z" level=error msg="method failed" func=github.com/piraeusdatastore/linstor-csi/pkg/driver.Driver.Run.func1 file="/src/pkg/driver/driver.go:1220" error="rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-3eb665ec-27ec-4875-8df9-106751f1a832: context deadline exceeded" linstorCSIComponent=driver method=/csi.v1.Controller/ControllerPublishVolume nodeID=vm7-cplane2.23-106-61-232.lon-01.uk.appsolo.com provisioner=linstor.csi.linbit.com req="volume_id:\"pvc-3eb665ec-27ec-4875-8df9-106751f1a832\" node_id:\"vm9-node2.23-106-61-193.lon-01.uk.appsolo.com\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:\"linstor.csi.linbit.com/mount-options\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/post-mount-xfs-opts\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/remote-access-policy\" value:\"true\" > volume_context:<key:\"linstor.csi.linbit.com/uses-volume-context\" value:\"true\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1661955201588-8081-linstor.csi.linbit.com\" > " resp="<nil>" version=v0.20.0-d514e41db7cdcb580769cc69f1c1ef2b8a5def5d
In the case below the resource definition request times out after 1 minute:
time="2022-08-31T16:06:31Z" level=debug msg="getting assignments" func="github.com/piraeusdatastore/linstor-csi/pkg/client.(*Linstor).Status" file="/src/pkg/client/linstor.go:1626" linstorCSIComponent=client volume=pvc-3eb665ec-27ec-4875-8df9-106751f1a832
time="2022-08-31T16:06:31Z" level=debug msg="curl -X 'GET' -H 'Accept: application/json' 'https://piraeus-op-cs.default.svc:3371/v1/resource-definitions/pvc-3eb665ec-27ec-4875-8df9-106751f1a832/resources'"
time="2022-08-31T16:07:30Z" level=error msg="method failed" func=github.com/piraeusdatastore/linstor-csi/pkg/driver.Driver.Run.func1 file="/src/pkg/driver/driver.go:1220" error="rpc error: code = Internal desc = ControllerPublishVolume failed for pvc-3eb665ec-27ec-4875-8df9-106751f1a832: failed to check existing assignment: context canceled" linstorCSIComponent=driver method=/csi.v1.Controller/ControllerPublishVolume nodeID=vm7-cplane2.23-106-61-232.lon-01.uk.appsolo.com provisioner=linstor.csi.linbit.com req="volume_id:\"pvc-3eb665ec-27ec-4875-8df9-106751f1a832\" node_id:\"vm9-node2.23-106-61-193.lon-01.uk.appsolo.com\" volume_capability:<mount:<fs_type:\"xfs\" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > volume_context:<key:\"linstor.csi.linbit.com/mount-options\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/post-mount-xfs-opts\" value:\"\" > volume_context:<key:\"linstor.csi.linbit.com/remote-access-policy\" value:\"true\" > volume_context:<key:\"linstor.csi.linbit.com/uses-volume-context\" value:\"true\" > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"1661955201588-8081-linstor.csi.linbit.com\" > " resp="<nil>" version=v0.20.0-d514e41db7cdcb580769cc69f1c1ef2b8a5def5d
Hmm, I'm not sure on the order of those log lines. It could be that there are multiple requests interleaved. We should probably start adding a request ID to the logs at some point.
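Something like the following sketch could do that (the interceptor name and wiring are hypothetical, not existing linstor-csi code): a gRPC unary interceptor that tags every call's log lines with a unique ID, so interleaved requests can be told apart:

package main

import (
    "context"

    "github.com/google/uuid"
    "github.com/sirupsen/logrus"
    "google.golang.org/grpc"
)

// requestIDInterceptor is a hypothetical gRPC unary interceptor that tags
// each request's log entries with a unique ID and the method name.
func requestIDInterceptor(log *logrus.Logger) grpc.UnaryServerInterceptor {
    return func(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo,
        handler grpc.UnaryHandler) (interface{}, error) {
        entry := log.WithFields(logrus.Fields{
            "requestID": uuid.New().String(),
            "method":    info.FullMethod,
        })
        entry.Debug("request started")
        resp, err := handler(ctx, req)
        if err != nil {
            entry.WithError(err).Error("method failed")
        }
        return resp, err
    }
}

func main() {
    // Wiring it into the server would look roughly like this:
    _ = grpc.NewServer(grpc.UnaryInterceptor(requestIDInterceptor(logrus.New())))
}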
The empty replies are probably because of the missing client certificates for mTLS.
I'm not sure why it would take LINSTOR that long to answer these requests. You said you have a lot of zfs datasets, but the actual ZVols for LINSTOR seem to be few in number. It might be worth asking on the linstor-server project directly.
And maybe we can move away from the "view" method in that case; I'm not sure if that would help.
The zfs list command itself takes 1.67 secs. I do think it would be better to list just the datasets in use by linstor rather than everything, especially where there are many more datasets outside of linstor. As an example, zfs list -r rpool/team-100 takes just 14ms.
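As a rough sketch of what scoping the listing to the LINSTOR pool could look like from code (assuming the pool prefix is known up front; this is not how LINSTOR currently does it):

package main

import (
    "fmt"
    "os/exec"
)

func main() {
    // List only the datasets under the LINSTOR-managed pool instead of
    // every dataset on the system ("rpool/team-100" is our pool here).
    out, err := exec.Command("zfs", "list", "-H", "-r", "rpool/team-100").Output()
    if err != nil {
        panic(err)
    }
    fmt.Print(string(out))
}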
The link https://gitlab.at.linbit.com/linstor/linstor-server is not working for me, did you mean https://github.com/LINBIT/linstor-server?
Yes, sorry, used the internal link there...
I think we may be able to speed it up by implementing https://github.com/LINBIT/linstor-server/issues/309#issuecomment-1233971633 (I will test it).
However, we should also be able to tolerate a delay of a few seconds from the server, as in some cases a user could have thousands of volumes and snapshots in use by linstor, which could produce the same issue even with the speed-up.
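As a side note, a derived context can only shorten a deadline, never extend it, so the plugin could not simply grant itself more time than its caller allows; a quick Go sketch:

package main

import (
    "context"
    "fmt"
    "time"
)

func main() {
    // Parent deadline, as a sidecar might set it: 1 second from now.
    parent, cancel := context.WithTimeout(context.Background(), 1*time.Second)
    defer cancel()

    // Asking for 30 seconds has no effect; the child still expires
    // together with the parent.
    child, cancel2 := context.WithTimeout(parent, 30*time.Second)
    defer cancel2()

    dl, _ := child.Deadline()
    fmt.Println("effective deadline in:", time.Until(dl).Round(time.Second)) // ~1s
}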
Can you let me know where the timeout value is set on the HTTP requests? I can try increasing it to check whether that solves the issue.
The timeout is created by the CSI sidecars. In this case I think it's here: https://github.com/kubernetes-csi/external-attacher/ (--timeout), which seems to default to 15s. I thought it was 1m, but that could have been another sidecar.
1 762125 762125 6213 ? -1 Sl 0 7:37 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id ed2cfaae4049056770ad09a6d58720a363aa6b89df66c56baed57ad3e6180877 -address /run/containerd/containerd.sock
762125 762271 762271 762271 ? -1 Ss 65535 0:00 \_ /pause
762125 803731 803731 803731 ? -1 Ssl 0 0:35 \_ /csi-attacher --v=5 --csi-address=/var/lib/csi/sockets/pluginproxy/./csi.sock --timeout=1m --leader-election=true --leader-election-namespace=default --worker-threads=30
762125 804114 804114 804114 ? -1 Ssl 0 0:36 \_ /livenessprobe --csi-address=/var/lib/csi/sockets/pluginproxy/./csi.sock
762125 815470 815470 815470 ? -1 Ssl 0 3:01 \_ /csi-provisioner --csi-address=/var/lib/csi/sockets/pluginproxy/./csi.sock --timeout=1m --default-fstype=ext4 --feature-gates=Topology=true --leader-election=true --leader-election-namespace=default --enable-capacity --extra-create-metadata --capacity-ownerref-level=2 --worker-threads=30
762125 816603 816603 816603 ? -1 Ssl 0 0:32 \_ /csi-snapshotter --timeout=1m --csi-address=/var/lib/csi/sockets/pluginproxy/./csi.sock --leader-election=true --leader-election-namespace=default --worker-threads=30
762125 817633 817633 817633 ? -1 Ssl 0 0:33 \_ /csi-resizer --v=5 --csi-address=/var/lib/csi/sockets/pluginproxy/./csi.sock --timeout=1m --handle-volume-inuse-error=false --leader-election=true --leader-election-namespace=default --workers=30
762125 818013 818013 818013 ? -1 Ssl 0 0:32 \_ /linstor-csi --csi-endpoint=unix:///var/lib/csi/sockets/pluginproxy/./csi.sock --node=vm9-node2.23-106-61-193.lon-01.uk.appsolo.com --linstor-endpoint=https://piraeus-op-cs.default.svc:3371 --log-level=debug
These are hardcoded here:
https://github.com/piraeusdatastore/piraeus-operator/blob/5959d817d0791641670ea639f4996c65edffc77a/pkg/controller/linstorcsidriver/linstorcsidriver_controller.go#L828-L910
It looks like the sidecar containers are set to 1m, but the timeout is actually happening in the linstor-csi container, which doesn't appear to have any timeout flag set.
So either linstor-csi has a low ctx timeout, or the delay on those endpoints is not the root cause.
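For anyone following along: the sidecar's --timeout becomes the gRPC deadline, and gRPC transmits that deadline to the server, so the plugin's handler context inherits it without any flag of its own. A minimal sketch of the two halves (names are illustrative, not the real code paths):

package main

import (
    "context"
    "fmt"
    "time"
)

// The sidecar (e.g. csi-attacher with --timeout=1m) wraps each RPC like
// this; gRPC sends the deadline to the server along with the call.
func sidecarContext() (context.Context, context.CancelFunc) {
    return context.WithTimeout(context.Background(), time.Minute)
}

// Inside the plugin, the handler's ctx already carries that deadline,
// so no separate timeout flag on linstor-csi is needed.
func handler(ctx context.Context) error {
    if dl, ok := ctx.Deadline(); ok {
        fmt.Println("deadline inherited from the caller:", time.Until(dl).Round(time.Second))
    }
    select {
    case <-ctx.Done():
        return ctx.Err() // context deadline exceeded / context canceled
    case <-time.After(5 * time.Second): // pretend the LINSTOR call took 5s
        return nil
    }
}

func main() {
    ctx, cancel := sidecarContext()
    defer cancel()
    fmt.Println(handler(ctx))
}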
Out of curiosity, did you provision your volumes before adding the HA parameters to your storage class? (E.g. by deleting it and then re-creating with new parameters.) This is something I just ran into -- none of my pods were getting handled correctly by HA because they were provisioned before I had added the HA params. I had to manually adjust their ResourceGroup to add the DRBD options.
We don't have any HA parameters on the storage class; the latest version of https://github.com/piraeusdatastore/piraeus-ha-controller is opt-out, rather than opt-in like the previous version.
We can see that the pods are getting rescheduled onto the correct node properly; they're just not being attached due to a context timeout.
@WanzenBug I tried to follow the calls up from https://github.com/piraeusdatastore/linstor-csi/blob/4ee8dce775083abefa1d1b021ae864109cbbd339/pkg/client/linstor.go#L430 to find where ctx is set. It appears to be somewhere in the grpc code, but it's not easy to find without a stack trace.
As you mentioned before, it's difficult to know which requests are timing out and when, since there are no identifiers on the API calls. So it might be worth adding some additional debugging.
Would you be able to run curl -k -X GET https://piraeus-op-cs.default.svc:3371/v1/view/resources on a healthy system and let me know the output?
Sure. For completeness, I added some flags to curl to make it work with mTLS (the debug output in the logs does not handle that correctly), and my deployment is named linstor-cs instead of piraeus-op-cs, but that will not influence the output. I've also cut out the DRBD hmac secret, just to be sure :wink:
kubectl exec deployment/linstor-cs-controller -- curl --cacert /etc/linstor/client/ca.crt --key /etc/linstor/client/tls.key --cert /etc/linstor/client/tls.crt https://linstor-cs.storage.svc:3371/v1/view/resources | jq '.[].layer_object.drbd.drbd_resource_definition.secret = ""'
Thanks @WanzenBug, the command works with the mTLS certs included, although it took 10 seconds to complete (4 seconds on the subsequent cached requests).
Surely 4-10 seconds should not cause a timeout?
failed to check existing assignment: context canceled
Is this error always emitted by calling the cancel method on the ctx, or can it also be emitted when the deadline is exceeded / the request times out?
I think that is only emitted when the ctx is cancelled. That probably happens somewhere in the grpc server, when the RPC call itself is canceled or times out.
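For reference, the two errors are distinct values in Go and can be told apart with errors.Is; a quick sketch:

package main

import (
    "context"
    "errors"
    "fmt"
    "time"
)

func main() {
    // Deadline case: the ctx expires on its own.
    ctx1, cancel1 := context.WithTimeout(context.Background(), 10*time.Millisecond)
    defer cancel1()
    <-ctx1.Done()
    fmt.Println(errors.Is(ctx1.Err(), context.DeadlineExceeded)) // true

    // Cancel case: someone explicitly calls cancel(), e.g. the gRPC
    // server when the client side gives up on the call.
    ctx2, cancel2 := context.WithCancel(context.Background())
    cancel2()
    fmt.Println(errors.Is(ctx2.Err(), context.Canceled)) // true
}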