Grant Millar
Here's an example of the drbd state when adding 6 pvcs at once: ``` pvc-210c18f6-9a43-4b94-8903-6d3767445044 role:Primary disk:UpToDate dedi1-node1.23-106-60-155.lon-01.uk role:Secondary peer-disk:UpToDate vm6-cplane1.23-106-61-231.lon-01.uk role:Secondary peer-disk:Diskless peer-client:yes pvc-23a1dba9-25ab-44c0-a077-9604144d097c role:Secondary disk:UpToDate dedi1-node1.23-106-60-155.lon-01.uk role:Secondary peer-disk:UpToDate...
It appears I got the causation wrong: something else is causing the delay, and once `AttachVolume.Attach` stops timing out, the pvc is promoted...
Further info: ``` kubectl version | grep Server Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:18:48Z", GoVersion:"go1.18.2", Compiler:"gc", Platform:"linux/amd64"} ``` ``` grep Image: /root/k8s-config/manifests/piraeus.yaml storkImage: docker.io/openstorage/stork:2.8.2 schedulerImage: k8s.gcr.io/kube-scheduler-amd64 pluginImage: quay.io/piraeusdatastore/piraeus-csi:v0.20.0...
Here are some further logs from `kube-controller-manager` with `--v=4` ``` I0822 13:04:24.426940 1 reconciler.go:325] "attacherDetacher.AttachVolume started" volume={VolumeToAttach:{MultiAttachErrorReported:false VolumeName:kubernetes.io/csi/linstor.csi.linbit.com^pvc-7e1063f2-b831-41a6-8c85-fee1fbf6c483 VolumeSpec:0xc0052f7650 NodeName:vm9-node2.23-106-61-193.lon-01.uk ScheduledPods:[&Pod{ObjectMeta:{nfs-taiga-staticdata-nfs-server-provisioner-0 nfs-taiga-staticdata-nfs-server-provisioner- team-100 bcccf001-6205-451e-89f8-bead2761be28 67233800 0 2022-08-19 15:48:51 +0000 UTC...
> > dedi1-node1.23-106-60-155.lon-01.uk ┊ SATELLITE ┊ 23.106.60.155:3367 (SSL) ┊ EVICTED > > I wonder if this has something to do with [auto-eviction](https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-linstor-auto-evict)? The node `dedi1-node1.23-106-60-155.lon-01.uk` is properly evicted as we're...
I've recreated the pvs with the debug logs enabled, listing the resources is taking a long time: ``` # time linstor r l -a ╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ ┊ ResourceName ┊ Node ┊...
The `zfs list` command itself takes 1.67 seconds. I do think it would be better to list only the datasets in use by LINSTOR rather than everything, especially...
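To make the idea concrete, here is a minimal Python sketch of the difference between listing everything and listing only named datasets. It is not how LINSTOR invokes `zfs`; the helper names (`zfs_list_cmd`, `parse_zfs_list`, `timed_list`) and the dataset name in the usage note are hypothetical. It relies only on the standard `zfs list` options `-H` (no header, tab-separated), `-p` (exact numeric values), `-o` (columns) and `-t` (dataset types), plus optional dataset arguments.

```python
import subprocess
import time


def zfs_list_cmd(datasets=None):
    """Build a `zfs list` argv; with no datasets, everything is listed."""
    cmd = ["zfs", "list", "-H", "-p", "-o", "name,used", "-t", "filesystem,volume"]
    if datasets:
        # Naming datasets explicitly restricts the listing to just those.
        cmd += list(datasets)
    return cmd


def parse_zfs_list(output):
    """Parse tab-separated `name<TAB>used` lines into {name: used_bytes}."""
    rows = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, used = line.split("\t")
        rows[name] = int(used)
    return rows


def timed_list(datasets=None):
    """Run `zfs list` and return (parsed rows, elapsed seconds)."""
    start = time.monotonic()
    result = subprocess.run(
        zfs_list_cmd(datasets), check=True, capture_output=True, text=True
    )
    return parse_zfs_list(result.stdout), time.monotonic() - start
```

On a node with ZFS installed, comparing `timed_list()` against `timed_list(["tank/pvc-example_00000"])` (a made-up dataset name) would show whether restricting the listing actually helps on a given pool; I have not measured this here.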
I think we may be able to speed it up by implementing https://github.com/LINBIT/linstor-server/issues/309#issuecomment-1233971633 (I will test it). However, we should also be able to tolerate a delay of a few...
``` 1 762125 762125 6213 ? -1 Sl 0 7:37 /usr/bin/containerd-shim-runc-v2 -namespace k8s.io -id ed2cfaae4049056770ad09a6d58720a363aa6b89df66c56baed57ad3e6180877 -address /run/containerd/containerd.sock 762125 762271 762271 762271 ? -1 Ss 65535 0:00 \_ /pause 762125 803731...
> Out of curiosity, did you provision your volumes before adding the HA parameters to your storage class? (E.g. by deleting it and then re-creating with new parameters.) This is...