mayastor
mayastor-csi node-driver-registrar Registration process failed on aws eks
Hello. When I follow the instructions for mayastor on an AWS cluster (EKS), the mayastor-csi node-driver-registrar always fails with the following log:
I0608 14:48:02.757010 1 main.go:113] Version: v2.1.0-0-g80d42f24
I0608 14:48:02.757918 1 connection.go:153] Connecting to unix:///csi/csi.sock
I0608 14:48:02.852446 1 node_register.go:52] Starting Registration Server at: /registration/io.openebs.csi-mayastor-reg.sock
I0608 14:48:02.852593 1 node_register.go:61] Registration Server started at: /registration/io.openebs.csi-mayastor-reg.sock
I0608 14:48:02.852650 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
I0608 14:48:04.226354 1 main.go:80] Received GetInfo call: &InfoRequest{}
I0608 14:48:04.589035 1 main.go:90] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"ip-X-X-X-X" but existing label is "kubernetes.io/hostname":"ip-X-X-X-X.us-east-1.compute.internal",}
E0608 14:48:04.589073 1 main.go:92] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"ip-X-X-X-X" but existing label is "kubernetes.io/hostname":"ip-X-X-X-X.us-east-1.compute.internal", restarting registration container.
Hi, we have made a fix for this and the change is currently on release/1.0.2. Can you please try it?
You can use the mayastor-daemonset yaml from the deploy folder.
FYI: you will need to pass the hostname as the node name when creating pools, because with the - "--node-name=$(MY_NODE_NAME)" flag removed, mayastor registers itself with the hostname.
The official v1.0.2 release, with images provided by the project maintainers, is expected within the next 30 days.
I have a similar issue, with the same error from the csi pods:
mayastor-csi-6pg5v 1/2 CrashLoopBackOff 6 (2m51s ago) 8m48s
mayastor-csi-gfp4r 1/2 CrashLoopBackOff 6 (3m1s ago) 8m47s
mayastor-csi-h4ttr 1/2 CrashLoopBackOff 6 (2m37s ago) 8m47s
Logs from the csi-driver-registrar container:
I0704 02:01:58.054673 1 main.go:113] Version: v2.1.0-0-g80d42f24
I0704 02:01:58.055231 1 connection.go:153] Connecting to unix:///csi/csi.sock
I0704 02:01:58.056390 1 node_register.go:52] Starting Registration Server at: /registration/io.openebs.csi-mayastor-reg.sock
I0704 02:01:58.056508 1 node_register.go:61] Registration Server started at: /registration/io.openebs.csi-mayastor-reg.sock
I0704 02:01:58.056594 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
I0704 02:01:59.463465 1 main.go:80] Received GetInfo call: &InfoRequest{}
I0704 02:01:59.822520 1 main.go:90] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"95" but existing label is "kubernetes.io/hostname":"95.xxx.xx.xxx",}
E0703 18:09:13.292330 1 main.go:92] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"95" but existing label is "kubernetes.io/hostname":"95.xxx.xx.xxx", restarting registration container.
where 95.xxx.xx.xxx is the hostname of that node: kubernetes.io/hostname=95.xxx.xx.xxx
It seems the CSI driver somehow reported only the first portion of the full hostname (95).
Then I tried the manifests of release/1.0.2, for both mayastor and mayastor-control-plane, but got the same error.
kubernetes version: 1.23
rke2
I'm having the same issue as @tz-torchai above. I'm following the instructions available at https://mayastor.gitbook.io/introduction/quickstart/deploy-mayastor and installed csi-daemonset with the following command.
kubectl apply -f https://raw.githubusercontent.com/openebs/mayastor/master/deploy/csi-daemonset.yaml
> kubectl -n mayastor get pods -w
NAME READY STATUS RESTARTS AGE
mayastor-csi-2pxr2 1/2 CrashLoopBackOff 6 (2m21s ago) 12m
mayastor-csi-gklm8 1/2 CrashLoopBackOff 6 (2m21s ago) 12m
mayastor-csi-mgtt5 1/2 CrashLoopBackOff 6 (4m34s ago) 12m
mayastor-csi-sxhtn 1/2 CrashLoopBackOff 6 (3m30s ago) 12m
mayastor-etcd-0 1/1 Running 0 13m
mayastor-etcd-1 1/1 Running 0 13m
mayastor-etcd-2 1/1 Running 0 13m
nats-0 2/2 Running 0 15m
nats-1 2/2 Running 0 14m
nats-2 2/2 Running 0 14m
> kubectl -n mayastor logs mayastor-csi-2pxr2 -c csi-driver-registrar
I0816 20:34:51.285641 1 main.go:113] Version: v2.1.0-0-g80d42f24
I0816 20:34:51.286084 1 connection.go:153] Connecting to unix:///csi/csi.sock
I0816 20:34:51.287162 1 node_register.go:52] Starting Registration Server at: /registration/io.openebs.csi-mayastor-reg.sock
I0816 20:34:51.287276 1 node_register.go:61] Registration Server started at: /registration/io.openebs.csi-mayastor-reg.sock
I0816 20:34:51.287297 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
I0816 20:34:52.735125 1 main.go:80] Received GetInfo call: &InfoRequest{}
I0816 20:34:53.120028 1 main.go:90] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"10" but existing label is "kubernetes.io/hostname":"10.41.3.92",}
E0816 20:34:53.120079 1 main.go:92] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: error updating Node object with CSI driver node info: error updating node: timed out waiting for the condition; caused by: detected topology value collision: driver reported "kubernetes.io/hostname":"10" but existing label is "kubernetes.io/hostname":"10.41.3.92", restarting registration container.
This is broken on 1.0.2 AFAICT: we strip out the subdomain, which doesn't really work when your hostname is an IP address. We've changed the way we handle node names on the latest code base, but it hasn't been released yet.
You can have a peek at the develop branch by using this helmchart, but please be aware that it is not compatible with the latest release and will likely incur breaking changes until it reaches a new release, so it is not advised for production.
I've made a sneaky test image compatible with 1.0.2 if you want to quickly see whether not splitting the hostname works. It may well fail elsewhere, as I've not tested this at all, so use at your peril :) mayadata/mayastor-csi:2a4f05e0b37b
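To make the failure mode concrete, here is a minimal shell sketch of the stripping behaviour described above. `strip_domain` is a hypothetical helper written for illustration, not the actual mayastor code; it only assumes that the driver keeps the text before the first period. The two hostnames are the ones reported in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not mayastor's implementation): keep only the part
# of the name before the first '.', as the thread describes for 1.0.2.
strip_domain() {
    printf '%s\n' "${1%%.*}"
}

# Compare the stripped value with the full name that Kubernetes stores in
# the kubernetes.io/hostname label; any difference is a topology collision.
for name in ip-X-X-X-X.us-east-1.compute.internal 10.41.3.92; do
    short=$(strip_domain "$name")
    if [ "$short" != "$name" ]; then
        echo "collision: driver would report \"$short\" but label is \"$name\""
    fi
done
```

Under this assumption an IP-style hostname such as 10.41.3.92 collapses to 10, and an FQDN loses everything after its first label, which matches the values shown in the registrar logs.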
Your hack seems to be holding so far...
NAME READY STATUS RESTARTS AGE
mayastor-csi-6njnh 2/2 Running 0 2m55s
mayastor-csi-jpk66 2/2 Running 0 2m55s
mayastor-csi-nn7vn 2/2 Running 0 2m55s
mayastor-csi-rnrxl 2/2 Running 0 2m55s
mayastor-etcd-0 1/1 Running 1 (127m ago) 131m
mayastor-etcd-1 1/1 Running 1 (127m ago) 131m
mayastor-etcd-2 0/1 Running 24 (7m4s ago) 131m
nats-0 2/2 Running 0 132m
nats-1 2/2 Running 0 132m
nats-2 2/2 Running 0 131m
This is broken on 1.0.2 AFAICT, we strip out the subdomain, which doesn't really work when your hostname is an ip address.
Well, it is certainly broken for all hostnames with a period in them, so the typical FQDN hostname is affected too.
With v1.0.4 this still fails on EKS, where the node name looks like ip-xxx-xx-x-x.eu-central-1.compute.internal
It looks like https://github.com/kubernetes-csi/node-driver-registrar/issues/205, but I think it's because mayastor is sending it the wrong hostname?
@rhrytskiv I had the same issue yesterday and then tested in develop and it worked fine so it has been addressed but not backported.
Hi, I think this has been fixed on develop but not backported.
To test this could you please add - "-N$(MY_NODE_NAME)" to the mayastor-daemonset arguments and see if it works? Thanks
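One possible way to try that suggestion without editing the yaml by hand is a JSON Patch applied with `kubectl patch`. This is only a sketch under several assumptions that are not confirmed in this thread: the DaemonSet is named `mayastor` in the `mayastor` namespace, the mayastor container is the first one in the pod spec, and the manifest already defines the `MY_NODE_NAME` environment variable. `node_name_patch` is a hypothetical helper name.

```shell
#!/bin/sh
# Print a JSON Patch that appends the node-name argument to the args of
# the first container. Single quotes keep $(MY_NODE_NAME) literal so that
# Kubernetes, not the local shell, expands it from the container's env.
node_name_patch() {
    printf '%s\n' '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"-N$(MY_NODE_NAME)"}]'
}

# Assumed names: namespace "mayastor", DaemonSet "mayastor", container 0.
# Uncomment to apply against a cluster once those assumptions are checked:
# kubectl -n mayastor patch daemonset mayastor --type=json -p "$(node_name_patch)"
```

Check the container index and DaemonSet name with `kubectl -n mayastor get daemonset` before applying; if the args list does not exist on that container, the patch will be rejected rather than silently changing something else.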