Bug: datashim containers keep crashing on version 0.4.1 "Failed to connect to the CSI driver"
What happened:
All datashim containers keep crashing on version 0.4.1. The 0.4.0 release works normally.
kubectl -n dlf logs pod/csi-attacher-s3-0
I0311 15:16:58.441647 1 main.go:109] "Version" version="v4.7.0"
I0311 15:16:58.443204 1 connection.go:234] "Connecting" address="unix:///csi/csi.sock"
I0311 15:17:08.443973 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
I0311 15:17:18.444043 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
I0311 15:17:28.443460 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
E0311 15:17:28.443539 1 main.go:149] "Failed to connect to the CSI driver" err="context deadline exceeded" csiAddress="/csi/csi.sock"
kubectl -n dlf get all
NAME                                    READY   STATUS             RESTARTS      AGE
pod/csi-attacher-s3-0                   0/1     CrashLoopBackOff   4 (71s ago)   5m19s
pod/csi-provisioner-s3-0                0/1     CrashLoopBackOff   4 (79s ago)   5m19s
pod/csi-s3-29jdp                        0/2     CrashLoopBackOff   9 (83s ago)   5m19s
pod/csi-s3-snjj7                        0/2     CrashLoopBackOff   9 (82s ago)   5m18s
pod/csi-s3-tzrgv                        0/2     CrashLoopBackOff   9 (67s ago)   5m19s
pod/dataset-operator-78555d79d6-98k5s   1/1     Running            0             5m20s
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
It happens on all of my clusters
Anything else we need to know?:
Environment:
- Datashim version: 0.4.1
- Kubernetes version (use kubectl version): Client Version: v1.32.1, Kustomize Version: v5.5.0, Server Version: v1.32.1
- Kubernetes distribution: normal kubernetes
- Cloud provider or hardware configuration: qemu/kvm
- OS (e.g: cat /etc/os-release): Gentoo
- Kernel (e.g. uname -a): 6.6.74-gentoo-dist
- Install tools: kubeadm
- Others:
@adippl apologies for the delay and thanks for the bug report. We'll try to reproduce it on our end
Hello, I just got the same error while deploying 0.4.1 with the latest revision of the Helm chart. Version 0.4.0 works fine for me as well.
Client Version: v1.32.3, Kustomize Version: v5.5.0, Server Version: v1.31.7+rke2r1
OS release: Ubuntu 22.04.5 LTS
Hardware: KVM + host CPU (6 cores, 12 GB RAM)
kubectl -n datashim logs pod/csi-attacher-s3-0
I0409 11:20:09.906957 1 main.go:109] "Version" version="v4.7.0"
I0409 11:20:09.907682 1 connection.go:234] "Connecting" address="unix:///csi/csi.sock"
I0409 11:20:19.908374 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
I0409 11:20:29.908428 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
I0409 11:20:39.908314 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
E0409 11:20:39.908400 1 main.go:149] "Failed to connect to the CSI driver" err="context deadline exceeded" csiAddress="/csi/csi.sock"
kubectl -n datashim logs pod/csi-nodeplugin-nfsplugin-6zkrg
Defaulted container "node-driver-registrar" out of: node-driver-registrar, nfs
I0409 11:26:11.140098 1 main.go:150] "Version" version="v1.12.0"
I0409 11:26:11.140169 1 main.go:151] "Running node-driver-registrar" mode=""
I0409 11:26:11.140174 1 main.go:172] "Attempting to open a gRPC connection" csiAddress="/plugin/csi.sock"
I0409 11:26:11.140184 1 connection.go:234] "Connecting" address="unix:///plugin/csi.sock"
I0409 11:26:21.140564 1 connection.go:253] "Still connecting" address="unix:///plugin/csi.sock"
I0409 11:26:31.140493 1 connection.go:253] "Still connecting" address="unix:///plugin/csi.sock"
I0409 11:26:41.141151 1 connection.go:253] "Still connecting" address="unix:///plugin/csi.sock"
E0409 11:26:41.141222 1 main.go:176] "Error connecting to CSI driver" err="context deadline exceeded"
@celi28 @adippl The s3driver and nfs-plugin images in the official Helm chart use the 8f50a01 (0.4.1) tag, which may not support linux/amd64. I manually changed the tag to latest, and the pods now run successfully.
kubectl logs -ndlf csi-s3-4r7ws -c csi-s3
exec /s3driver: exec format error
cc @srikumar003
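An "exec format error" usually means the binary in the image was built for a different CPU architecture than the node. One way to check this without pulling the image is to inspect its manifest list (for example, the JSON printed by `docker manifest inspect <image>`) and see which platforms it lists. The sketch below assumes that JSON has already been saved or captured. `SAMPLE_INDEX`, its digests, and the helper names `supported_platforms` and `supports` are hypothetical and for illustration only; the sample is not the real datashim manifest.

```python
import json

# Hypothetical sample shaped like an OCI image index / Docker manifest list.
# The digests and platforms are made up; they are NOT the real datashim data.
SAMPLE_INDEX = json.loads("""
{
  "schemaVersion": 2,
  "manifests": [
    {"digest": "sha256:aaaa", "platform": {"architecture": "arm64", "os": "linux"}},
    {"digest": "sha256:bbbb", "platform": {"architecture": "ppc64le", "os": "linux"}}
  ]
}
""")


def supported_platforms(index):
    """Return the set of 'os/architecture' strings listed in a manifest index."""
    platforms = set()
    for entry in index.get("manifests", []):
        platform = entry.get("platform") or {}
        os_name = platform.get("os")
        arch = platform.get("architecture")
        if os_name and arch:
            platforms.add(f"{os_name}/{arch}")
    return platforms


def supports(index, wanted="linux/amd64"):
    """Check whether the index advertises the wanted platform."""
    return wanted in supported_platforms(index)


if __name__ == "__main__":
    print("platforms:", sorted(supported_platforms(SAMPLE_INDEX)))
    print("linux/amd64 supported:", supports(SAMPLE_INDEX))
```

Note that a single-platform image has no `manifests` list, so this helper returns an empty set for it; in that case the architecture has to be read from the image config instead.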
I can confirm that this workaround works, e.g.:
helm install datashim datashim/datashim-charts \
  --version v0.4.1 \
  --namespace dlf \
  -f -<<EOF
csi-nfs-chart:
  enabled: false
csi-s3-chart:
  enabled: true
  csis3:
    image: csi-s3
    tag: latest
EOF