kubernetes-csi-tencentcloud
csi drivers on OpenShift cluster
Hello, we had the Red Hat team test our Tencent Cloud CSI driver, and here is the result of their investigation:
Below is the response I got from the Red Hat team. I tried to install the Tencent Cloud CSI drivers on an OpenShift cluster running on Tencent Cloud. Unfortunately, none of them worked.
The steps I followed are below:
- download the images from tcr and push them to the disconnected Quay registry:
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-provisioner.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-attacher.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-snapshotter.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-snapshot-controller.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-resizer.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-tencentcloud-cbs.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-node-driver-registrar.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-tencentcloud-cfs.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-tencentcloud-cos.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/tcs-csi-tencentcloud-cos-launcher.tar
curl -O https://rhocp-41711-1332667311.cos.ap-nanjing.myqcloud.com/tcs-images/busybox.tar
podman load -i tcs-csi-provisioner.tar
podman load -i tcs-csi-attacher.tar
podman load -i tcs-csi-snapshotter.tar
podman load -i tcs-snapshot-controller.tar
podman load -i tcs-csi-resizer.tar
podman load -i tcs-csi-tencentcloud-cbs.tar
podman load -i tcs-csi-node-driver-registrar.tar
podman load -i tcs-csi-tencentcloud-cfs.tar
podman load -i tcs-csi-tencentcloud-cos.tar
podman load -i tcs-csi-tencentcloud-cos-launcher.tar
podman load -i busybox.tar
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-provisioner:v2.0.4 quay.ocp4.example.com:8443/tcr/tkeimages/csi-provisioner:v2.0.4
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-attacher:v3.0.2 quay.ocp4.example.com:8443/tcr/tkeimages/csi-attacher:v3.0.2
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-snapshotter:v3.0.2 quay.ocp4.example.com:8443/tcr/tkeimages/csi-snapshotter:v3.0.2
podman tag ccr.ccs.tencentyun.com/tkeimages/snapshot-controller:v3.0.2 quay.ocp4.example.com:8443/tcr/tkeimages/snapshot-controller:v3.0.2
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-resizer:v1.0.1 quay.ocp4.example.com:8443/tcr/tkeimages/csi-resizer:v1.0.1
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-tencentcloud-cbs:v2.3.3 quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cbs:v2.3.3
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-node-driver-registrar:v2.0.1 quay.ocp4.example.com:8443/tcr/tkeimages/csi-node-driver-registrar:v2.0.1
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-tencentcloud-cfs:v2.0.6 quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cfs:v2.0.6
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-tencentcloud-cos:v2.0.2 quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cos:v2.0.2
podman tag ccr.ccs.tencentyun.com/tkeimages/csi-tencentcloud-cos-launcher:v2.0.2 quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cos-launcher:v2.0.2
podman tag docker.io/library/busybox:stable-glibc quay.ocp4.example.com:8443/tcr/busybox:latest
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-provisioner:v2.0.4
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-attacher:v3.0.2
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-snapshotter:v3.0.2
podman push quay.ocp4.example.com:8443/tcr/tkeimages/snapshot-controller:v3.0.2
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-resizer:v1.0.1
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cbs:v2.3.3
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-node-driver-registrar:v2.0.1
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cfs:v2.0.6
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cos:v2.0.2
podman push quay.ocp4.example.com:8443/tcr/tkeimages/csi-tencentcloud-cos-launcher:v2.0.2
podman push quay.ocp4.example.com:8443/tcr/busybox:latest
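The tag and push steps above could also be written as one loop over the image list. This is only a sketch: the image names and versions are copied from the commands above, DRY_RUN and the helper names are made up for illustration, and busybox is left out because it is retagged from docker.io with a different tag. By default it only prints the podman commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: retag and push the already-loaded images in one loop.
# MIRROR is the registry path used in this report; replace it with your own.
MIRROR="${MIRROR:-quay.ocp4.example.com:8443/tcr}"
SRC="ccr.ccs.tencentyun.com"
# DRY_RUN is a hypothetical switch: 1 prints the commands, 0 runs them.
DRY_RUN="${DRY_RUN:-1}"

IMAGES="
tkeimages/csi-provisioner:v2.0.4
tkeimages/csi-attacher:v3.0.2
tkeimages/csi-snapshotter:v3.0.2
tkeimages/snapshot-controller:v3.0.2
tkeimages/csi-resizer:v1.0.1
tkeimages/csi-tencentcloud-cbs:v2.3.3
tkeimages/csi-node-driver-registrar:v2.0.1
tkeimages/csi-tencentcloud-cfs:v2.0.6
tkeimages/csi-tencentcloud-cos:v2.0.2
tkeimages/csi-tencentcloud-cos-launcher:v2.0.2
"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Tag each source image with the mirror path, then push it.
mirror_all() {
  for image in $IMAGES; do
    run podman tag "$SRC/$image" "$MIRROR/$image"
    run podman push "$MIRROR/$image"
  done
}

mirror_all
```

Run it with DRY_RUN=0 once the printed commands look right.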
- apply the image tag mirror set and switch the default project to kube-system:
oc apply -f tcr-itms.yaml
oc project kube-system
You have to replace "quay.ocp4.example.com:8443" with your Quay registry URL in the tcr-itms.yaml file.
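One way to do that replacement without editing the file by hand is a small sed wrapper. This is a sketch: the function name and the example registry host are hypothetical, and it assumes the registry string appears literally in tcr-itms.yaml.

```shell
# Sketch: print a copy of the manifest with the example registry replaced.
# $1: path to the manifest, $2: your registry host:port (hypothetical value below).
rewrite_registry() {
  sed "s|quay\.ocp4\.example\.com:8443|$2|g" "$1"
}

# Example (not executed here):
#   rewrite_registry tcr-itms.yaml registry.example.internal:8443 > tcr-itms.local.yaml
#   oc apply -f tcr-itms.local.yaml
```

The original file stays untouched, so it can be compared with the rewritten copy before applying.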
- deploy the CBS CSI driver, following https://github.com/TencentCloud/kubernetes-csi-tencentcloud/blob/master/docs/README_CBS.md :
oc apply -f cbs-secret.yaml
oc apply -f cbs-csi-node-rbac.yaml
oc apply -f cbs-csi-node.yaml
oc apply -f cbs-csi-controller-rbac.yaml
oc apply -f cbs-csi-controller.yaml
oc apply -f cbs-storageclass.yaml
oc apply -f cbs-test-pvc.yaml
oc apply -f cbs-test-pod.yaml
You have to replace the values of TENCENTCLOUD_CBS_API_SECRET_ID/TENCENTCLOUD_CBS_API_SECRET_KEY in the cbs-secret.yaml file with the base64 strings of your Tencent Cloud secret ID/key. The test-cbs-pvc and its PV stayed pending. It looked like the csi-cbs-node pods were not stable and the CSI socket could not be connected:
oc get storageclass
NAME      PROVISIONER                 RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
cbs-csi   com.tencent.cloud.csi.cbs   Delete          Immediate           false                  5m16s
oc get pvc
NAME           STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
test-cbs-pvc   Pending                                      cbs-csi        <unset>                 16m
oc get pv
No resources found
oc get pod
NAME                                 READY   STATUS             RESTARTS        AGE
cbs-test-app                         0/1     Pending            0               2s
csi-cbs-controller-fc5f44946-8kgs8   6/6     Running            0               38m
csi-cbs-node-64cnw                   2/2     Running            5 (3m10s ago)   22m
csi-cbs-node-94wv6                   2/2     Running            2 (4m26s ago)   22m
csi-cbs-node-gtmpt                   2/2     Running            5 (2m59s ago)   22m
csi-cbs-node-lm9s7                   1/2     CrashLoopBackOff   4 (72s ago)     22m
csi-cbs-node-ntgtd                   2/2     Running            3 (4m43s ago)   22m
csi-cbs-node-znrmj                   2/2     Running            3 (104s ago)    21m
oc logs csi-cbs-node-lm9s7 -c cbs-csi
I0406 06:40:40.610723 27585 main.go:44] Building kube configs for running in cluster...
I0406 06:40:40.612635 27585 driver.go:48] Driver: com.tencent.cloud.csi.cbs version: v2.3.3
oc logs csi-cbs-node-lm9s7 -c driver-registrar
I0406 06:17:43.520604 18048 main.go:112] Version: v2.0.1
I0406 06:17:43.520642 18048 main.go:122] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0406 06:17:43.520652 18048 connection.go:151] Connecting to unix:///csi/csi.sock
W0406 06:17:53.520744 18048 connection.go:170] Still connecting to unix:///csi/csi.sock
- deploy the CFS CSI driver, following https://github.com/TencentCloud/kubernetes-csi-tencentcloud/blob/master/docs/README_CFS.md :
oc apply -f cfs-secret.yaml
oc apply -f cfs-csi-rbac.yaml
oc apply -f cfs-csi-driver.yaml
oc apply -f cfs-csi-nodeplugin.yaml
oc apply -f cfs-csi-provisioner.yaml
oc apply -f cfs-storageclass.yaml
oc apply -f cfs-test-pvc.yaml
oc apply -f cfs-test-pod.yaml
You have to replace the values of TENCENTCLOUD_CFS_API_SECRET_ID/TENCENTCLOUD_CFS_API_SECRET_KEY in the cfs-secret.yaml file with the base64 strings of your Tencent Cloud secret ID/key. The test-cfs-pvc and its PV stayed pending, even after I created the CFS service manually. It looked like this:
oc get storageclass
NAME      PROVISIONER                 RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
cfs-csi   com.tencent.cloud.csi.cfs   Delete          Immediate           false                  3m47s
oc get pvc
NAME           STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
test-cfs-pvc   Pending                                      cfs-csi        <unset>                 43s
oc get pv
No resources found
oc get pod
NAME                             READY   STATUS    RESTARTS   AGE
cfs-csi-app                      0/1     Pending   0          3s
csi-nodeplugin-cfsplugin-2pwnw   2/2     Running   0          7m42s
csi-nodeplugin-cfsplugin-52drh   2/2     Running   0          7m42s
csi-nodeplugin-cfsplugin-8fxv8   2/2     Running   0          7m42s
csi-nodeplugin-cfsplugin-c4qlg   2/2     Running   0          7m42s
csi-nodeplugin-cfsplugin-g8j5c   2/2     Running   0          7m42s
csi-nodeplugin-cfsplugin-npdp5   2/2     Running   0          7m42s
csi-provisioner-cfsplugin-0      2/2     Running   0          5m17s
oc get event
1m Warning ProvisioningFailed persistentvolumeclaim/test-cfs-pvc failed to provision volume with StorageClass "cfs-csi": rpc error: code = Internal desc = [TencentCloudSDKError] Code=ClientError.NetworkError, Message=Fail to get response because Post "https://cfs.internal.tencentcloudapi.com/": net/http: invalid header field value "TC3-HMAC-SHA256 Credential=AKIDZzEF3e5JK8j7IH3cz84nZBbqshJkxymM\n/2025-04-06/cfs/tc3_request, SignedHeaders=content-type;host, Signature=b5bd5cae03a9f9bfe308b308c541c747bdee039860d68901ba6ea4165878a5f6" for key Authorization, RequestId=
- deploy the COSFS CSI driver, following https://github.com/TencentCloud/kubernetes-csi-tencentcloud/blob/master/docs/README_COSFS.md :
oc apply -f cosfs-csi-driver.yaml
oc apply -f cosfs-csi-launcher.yaml
oc apply -f cosfs-csi-node-rbac.yaml
oc apply -f cosfs-csi-node.yaml
It looked like the csi-cosplugin pods failed to start. I found nothing about the cos-lite ConfigMap in their documentation:
oc get pod
NAME                    READY   STATUS             RESTARTS      AGE
csi-coslauncher-5w4tq   1/1     Running            0             11m
csi-coslauncher-c4pvg   1/1     Running            0             11m
csi-coslauncher-fqq5h   1/1     Running            0             11m
csi-cosplugin-4fbsv     1/2     CrashLoopBackOff   7 (22s ago)   11m
csi-cosplugin-q8l52     1/2     CrashLoopBackOff   7 (33s ago)   11m
csi-cosplugin-rr2qq     1/2     CrashLoopBackOff   7 (26s ago)   11m
oc logs csi-cosplugin-4fbsv -c cosfs
I0406 08:11:19.141429 1 main.go:40] Building clientset for running in cluster...
I0406 08:11:19.142693 1 lite_config.go:326] start Init common liteConfigMap ...
F0406 08:11:19.148332 1 main.go:52] failed to initLiteConfigMap, err: fail get cm: kube-system/cos-lite, err: configmaps "cos-lite" is forbidden: User "system:serviceaccount:kube-system:csi-cos-tencentcloud" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
cbs issue
for "csi-cbs-node-lm9s7 1/2 CrashLoopBackOff", please
a. check --root-dir parameter of kubelet
b. try to delete and recreate that pod
cfs issue
- for k8s >=1.18, please use yaml with new suffix:
kubectl apply -f deploy/cfs/kubernetes/csi-nodeplugin-cfsplugin-new.yaml
kubectl apply -f deploy/cfs/kubernetes/csi-provisioner-cfsplugin-new.yaml
please follow https://github.com/TencentCloud/kubernetes-csi-tencentcloud/blob/master/docs/README_CFS.md
- when base64 encoding TENCENTCLOUD_CFS_API_SECRET_ID/TENCENTCLOUD_CFS_API_SECRET_KEY, please add -n:
echo -n <ID-or-KEY> | base64
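To see why the -n matters: without it, echo appends a newline that gets encoded into the secret. That appears to be the "\n" visible inside the Credential value in the ProvisioningFailed event above. A small sketch with a placeholder value (not a real credential), using printf as a portable equivalent of echo -n:

```shell
# Placeholder value for illustration only, 11 characters long.
SECRET_ID="AKIDexample"

# echo appends a trailing newline, which ends up inside the encoded secret.
with_newline="$(echo "$SECRET_ID" | base64)"
# printf '%s' writes the value as-is, like echo -n.
without_newline="$(printf '%s' "$SECRET_ID" | base64)"

# Decode both and count the bytes: the first one is one byte longer.
printf '%s' "$with_newline" | base64 --decode | wc -c
printf '%s' "$without_newline" | base64 --decode | wc -c
```

The same check can be run against the value already stored in cfs-secret.yaml to confirm whether it carries a trailing newline.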
cos issue
please run kubectl edit clusterrole csi-cos-tencentcloud and add the following rule:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "create", "delete", "update"]
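If editing interactively is inconvenient, the same rule could probably be appended with a JSON patch and then verified. This is a sketch, not something taken from the driver docs: the function names are made up, and it assumes the ClusterRole is named csi-cos-tencentcloud and the service account is the one shown in the error log (system:serviceaccount:kube-system:csi-cos-tencentcloud).

```shell
# JSON patch that appends one rule to the ClusterRole's rules list.
PATCH='[{"op":"add","path":"/rules/-","value":{"apiGroups":[""],"resources":["configmaps"],"verbs":["get","create","delete","update"]}}]'

# Hypothetical helper: apply the patch (needs cluster-admin rights).
add_configmap_rule() {
  oc patch clusterrole csi-cos-tencentcloud --type=json -p "$PATCH"
}

# Hypothetical helper: ask the API server whether the service account
# can now read configmaps in kube-system; prints "yes" or "no".
check_configmap_access() {
  oc auth can-i get configmaps \
    --as=system:serviceaccount:kube-system:csi-cos-tencentcloud \
    -n kube-system
}
```

Call add_configmap_rule, then check_configmap_access, and delete the csi-cosplugin pods so they restart with the new permissions.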