ceph-csi

rbd: implement GetSnapshot CSI procedure

Open nixpanic opened this issue 8 months ago • 10 comments

A new GetSnapshot CSI procedure is introduced.

Depends-on: container-storage-interface/spec#586
See-also: kubernetes/enhancements#5013

Note: This currently uses unmerged changes to the CSI specification, hence this PR is a Draft.

Rook is getting prepared for the CSI sidecar updates to prevent panics too (rook/rook#15878).


Show available bot commands

These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated failure (please report the failure too!)

nixpanic avatar Apr 28 '25 16:04 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 15 '25 15:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

The resizer sidecar does not seem to like the new capability (it does not know about it yet):

  I0515 16:24:34.336915 80385 log.go:56] Logs of cephcsi-e2e-bf9a0dd6/csi-rbdplugin-provisioner-84c67cdf4-mszb2:csi-resizer on node minikube

  I0515 16:24:34.336948 80385 log.go:57] STARTLOG

  I0515 16:20:18.786218       1 main.go:111] "Version" version="v1.13.1"
  I0515 16:20:18.787054       1 feature_gate.go:387] feature gates: {map[RecoverVolumeExpansionFailure:true]}
  I0515 16:20:18.787221       1 envvar.go:172] "Feature gate default state" feature="ClientsAllowCBOR" enabled=false
  I0515 16:20:18.787240       1 envvar.go:172] "Feature gate default state" feature="ClientsPreferCBOR" enabled=false
  I0515 16:20:18.787245       1 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
  I0515 16:20:18.787249       1 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
  I0515 16:20:18.788215       1 connection.go:234] "Connecting" address="unix:///csi/csi-provisioner.sock"
  I0515 16:20:18.789799       1 common.go:143] "Probing CSI driver for readiness"
  I0515 16:20:18.789842       1 connection.go:264] "GRPC call" method="/csi.v1.Identity/Probe" request="{}"
  I0515 16:20:18.791060       1 connection.go:270] "GRPC response" response="{}" err=null
  I0515 16:20:18.791083       1 connection.go:264] "GRPC call" method="/csi.v1.Identity/GetPluginInfo" request="{}"
  I0515 16:20:18.792150       1 connection.go:270] "GRPC response" response="{\"name\":\"rbd.csi.ceph.com\",\"vendor_version\":\"canary\"}" err=null
  I0515 16:20:18.792173       1 main.go:166] "CSI driver name" driverName="rbd.csi.ceph.com"
  I0515 16:20:18.792183       1 connection.go:264] "GRPC call" method="/csi.v1.Identity/GetPluginCapabilities" request="{}"
  I0515 16:20:18.792798       1 connection.go:270] "GRPC response" response="{\"capabilities\":[{\"service\":{\"type\":\"CONTROLLER_SERVICE\"}},{\"volume_expansion\":{\"type\":\"ONLINE\"}},{\"service\":{\"type\":\"VOLUME_ACCESSIBILITY_CONSTRAINTS\"}}]}" err=null
  I0515 16:20:18.792822       1 connection.go:264] "GRPC call" method="/csi.v1.Controller/ControllerGetCapabilities" request="{}"
  panic: runtime error: invalid memory address or nil pointer dereference
  [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0xa97ae2]

  goroutine 1 [running]:
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripSingleValue({0x24249c8, 0xc0000fefc0}, {{}, 0x1c60a00?, 0x0?, 0x1c5c140?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:71 +0x82
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripValue({0x24249c8, 0xc0000fefc0}, {{}, 0x1c60a00?, 0x0?, 0xc00017a5c8?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:94 +0x1b8
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage.func1({0x24249c8, 0xc0000fefc0}, {{}, 0x1c60a00?, 0x0?, 0xc00025d770?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:108 +0xdc
  google.golang.org/protobuf/internal/impl.(*messageState).Range(0xc00017a5a0, 0xc00025d770)
  	/workspace/vendor/google.golang.org/protobuf/internal/impl/message_reflect_gen.go:51 +0x1e3
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage({0x2415050, 0xc00017a5a0})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:103 +0x6d
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripSingleValue({0x24249c8, 0xc0000ff0e0}, {{}, 0x2074fa0?, 0xc00017a5a0?, 0x1c5c140?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:69 +0xd4
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripValue({0x24249c8, 0xc0000ff0e0}, {{}, 0x2074fa0?, 0xc00017a5a0?, 0xc0004bb680?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:94 +0x1b8
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage.func1({0x24249c8, 0xc0000ff0e0}, {{}, 0x2074fa0?, 0xc00017a5a0?, 0xc00025d760?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:108 +0xdc
  google.golang.org/protobuf/internal/impl.(*messageState).Range(0xc000120040, 0xc00025d760)
  	/workspace/vendor/google.golang.org/protobuf/internal/impl/message_reflect_gen.go:58 +0x130
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage({0x2415050, 0xc000120040})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:103 +0x6d
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripSingleValue({0x24249c8, 0xc0000fed80}, {{}, 0x2074fa0?, 0xc000120040?, 0x1c5c140?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:69 +0xd4
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripValue({0x24249c8, 0xc0000fed80}, {{}, 0x1f7d040?, 0xc00017a690?, 0xc00025e700?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:82 +0x20a
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage.func1({0x24249c8, 0xc0000fed80}, {{}, 0x1f7d040?, 0xc00017a690?, 0xc00025d670?})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:108 +0xdc
  google.golang.org/protobuf/internal/impl.(*messageState).Range(0xc00026ed40, 0xc00025d670)
  	/workspace/vendor/google.golang.org/protobuf/internal/impl/message_reflect_gen.go:51 +0x1e3
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.stripMessage({0x2415050, 0xc00026ed40})
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:103 +0x6d
  github.com/kubernetes-csi/csi-lib-utils/protosanitizer.(*stripSecrets).String(0xc0004bba10)
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/protosanitizer/protosanitizer.go:56 +0x4d
  github.com/kubernetes-csi/csi-lib-utils/connection.LogGRPC({0x23fbf38, 0xc00029ea10}, {0x2132464, 0x2c}, {0x1ea2f20, 0xc000495320}, {0x1ed10a0, 0xc00026ed40}, 0xc000334808, 0xc000495350, ...)
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/connection/connection.go:266 +0x290
  google.golang.org/grpc.NewClient.chainUnaryClientInterceptors.func1({0x23fbf38, 0xc00029ea10}, {0x2132464, 0x2c}, {0x1ea2f20, 0xc000495320}, {0x1ed10a0, 0xc00026ed40}, 0xc000334808, 0x220af80, ...)
  	/workspace/vendor/google.golang.org/grpc/clientconn.go:458 +0x118
  google.golang.org/grpc.(*ClientConn).Invoke(0xc000334808, {0x23fbf38?, 0xc00029ea10?}, {0x2132464?, 0xc000495320?}, {0x1ea2f20?, 0xc000495320?}, {0x1ed10a0?, 0xc00026ed40?}, {0x0, ...})
  	/workspace/vendor/google.golang.org/grpc/call.go:35 +0x205
  github.com/container-storage-interface/spec/lib/go/csi.(*controllerClient).ControllerGetCapabilities(0xc0004bbc38, {0x23fbf38, 0xc00029ea10}, 0xc000495320, {0x0, 0x0, 0x0})
  	/workspace/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi_grpc.pb.go:297 +0xc8
  github.com/kubernetes-csi/csi-lib-utils/rpc.GetControllerCapabilities({0x23fbf38, 0xc00029ea10}, 0x35da6a0?)
  	/workspace/vendor/github.com/kubernetes-csi/csi-lib-utils/rpc/common.go:87 +0x6a
  github.com/kubernetes-csi/external-resizer/pkg/csi.(*client).SupportsControllerResize(0x23fbe58?, {0x23fbf38?, 0xc00029ea10?})
  	/workspace/pkg/csi/client.go:106 +0x2d
  github.com/kubernetes-csi/external-resizer/pkg/resizer.supportsControllerResize({0x24083e0, 0xc0000d0db0}, 0xc000320780?)
  	/workspace/pkg/resizer/csi_resizer.go:250 +0x67
  github.com/kubernetes-csi/external-resizer/pkg/resizer.NewResizerFromClient({0x24083e0, 0xc0000d0db0}, 0x22ecb25c00, {0x2426e88, 0xc000104c40}, {0xc000240260, 0x10})
  	/workspace/pkg/resizer/csi_resizer.go:59 +0xb4
  main.main()
  	/workspace/cmd/csi-resizer/main.go:180 +0x9e5
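
One plausible reading of the trace, assuming the panic comes from a capability value that the sidecar's vendored copy of the spec does not know: a lookup by number returns nil and the result is dereferenced without a check. The sketch below is self-contained and does not use the real protosanitizer or protobuf APIs. The types, the capability table, and the number 99 are hypothetical stand-ins for illustration only.

```go
package main

import "fmt"

// enumValue is a hypothetical stand-in for an enum value descriptor.
type enumValue struct{ name string }

// knownCapabilities is an illustrative table of the capability numbers an
// older client was built with. Number 99 is deliberately missing.
var knownCapabilities = map[int32]*enumValue{
	1: {name: "CREATE_DELETE_VOLUME"},
	9: {name: "EXPAND_VOLUME"},
}

// lookup returns nil for numbers the client does not know about.
func lookup(n int32) *enumValue { return knownCapabilities[n] }

// unsafeName dereferences the lookup result without a nil check, so it
// panics with a nil pointer dereference for unknown numbers.
func unsafeName(n int32) string { return lookup(n).name }

// safeName falls back to printing the raw number for unknown values.
func safeName(n int32) string {
	if v := lookup(n); v != nil {
		return v.name
	}
	return fmt.Sprintf("%d", n)
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	fmt.Println(safeName(9))                           // known value: prints its name
	fmt.Println(safeName(99))                          // unknown value: prints the number
	fmt.Println(panics(func() { _ = unsafeName(99) })) // unchecked lookup panics
}
```

The defensive variant only changes how the value is logged; whether the sidecar can act on an unknown capability is a separate question.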

nixpanic avatar May 16 '25 07:05 nixpanic

This has been fixed with kubernetes-csi/csi-lib-utils#188. A build of the external-resizer sidecar needs csi-lib-utils v0.21.0 or newer.
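
A minimal sketch of checking a dependency version against that minimum, assuming plain `vMAJOR.MINOR.PATCH` strings. The helper names `parse` and `atLeast` are made up for this example. Pre-release and pseudo-versions are not handled and are reported as errors.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits "vMAJOR.MINOR.PATCH" into three integers.
// Versions with pre-release or build suffixes are rejected.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("unexpected version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("unexpected version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// atLeast reports whether version have is equal to or newer than want.
func atLeast(have, want string) (bool, error) {
	h, err := parse(have)
	if err != nil {
		return false, err
	}
	w, err := parse(want)
	if err != nil {
		return false, err
	}
	for i := range h {
		if h[i] != w[i] {
			return h[i] > w[i], nil
		}
	}
	return true, nil
}

func main() {
	// The version strings below are example inputs, not taken from a real build.
	for _, have := range []string{"v0.20.0", "v0.21.0", "v0.22.1"} {
		ok, err := atLeast(have, "v0.21.0")
		fmt.Println(have, ok, err)
	}
}
```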

nixpanic avatar May 16 '25 10:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 16 '25 12:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 16 '25 15:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 20 '25 08:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 20 '25 11:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar May 20 '25 12:05 nixpanic

Yay, success! :partying_face:

@yati1998, would it help to have Controller.GetSnapshot merged for testing with fixes in the external-snapshotter?

nixpanic avatar May 20 '25 15:05 nixpanic

/test ci/centos/mini-e2e/k8s-1.32

nixpanic avatar Jun 05 '25 15:06 nixpanic

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jul 05 '25 21:07 github-actions[bot]

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

mergify[bot] avatar Jul 05 '25 21:07 mergify[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Aug 05 '25 21:08 github-actions[bot]

This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.

github-actions[bot] avatar Aug 20 '25 21:08 github-actions[bot]

@nixpanic, with the progress on the VGS v1beta2 APIs, maybe we can re-open this now?

black-dragon74 avatar Sep 09 '25 11:09 black-dragon74