external-snapshotter icon indicating copy to clipboard operation
external-snapshotter copied to clipboard

Update VolumeGroupSnapshot to v1beta1

Open leonardoce opened this issue 1 year ago • 8 comments

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change

/kind bug /kind cleanup /kind design /kind documentation /kind failing-test /kind feature /kind flake

What this PR does / why we need it:

Update the definition of CRDs to move VolumeGroupSnapshot, VolumeGroupSnapshotContent, and VolumeGroupSnapshotClass from v1alpha1 to v1beta1.

A second commit adds and sets the new field VolumeSnapshotHandlePairList of the VolumeGroupSnapshotContent resource, as described in the latest iteration of the relative KEP.

cc: @xing-yang Which issue(s) this PR fixes:

Fixes #1119

Special notes for your reviewer:

The updated CRD still allows the API server to serve v1alpha1, allowing the current clients to continue working correctly. We may unset the served flag later, or decide to set it to false now.

This table is a summary of it:

Version Stored Served
v1alpha1 false true
v1beta1 true true

There's no conversion between different versions, as the structure is consistent.

Does this PR introduce a user-facing change?:

`VolumeGroupSnapshot`, `VolumeGroupSnapshotContent`, and `VolumeGroupSnapshotClass`
are now available in `v1beta1` version.
The structure of the different versions are consistent, with the `v1beta1` API having an optional added optional field in `VolumeGroupSnapshotContent`. The `v1alpha1` API is still being served.

leonardoce avatar Aug 30 '24 10:08 leonardoce

/cc xing-yang

nixpanic avatar Oct 01 '24 14:10 nixpanic

I rebased this PR and added a commit to add and set the new field VolumeSnapshotHandlePairList of the VolumeGroupSnapshotContent resource, as described in the latest iteration of the relative KEP.

Thank you!

cc: @xing-yang

leonardoce avatar Oct 18 '24 06:10 leonardoce

@leonardoce We discussed this PR during today's CSI Implementation meeting. @jsafrane suggested that we break this PR into two PRs.

  • PR 1: Add new field and populate the new field but keeping the APIs at alpha.
  • PR 2: Move the APIs to Beta. There are other things we need to do between PR 1 and PR2. See detailed plan here: https://docs.google.com/document/d/1jjliw0tFhCPnT9ixlYpj61k5u8_-awf1qSCWrkunshQ/edit?tab=t.0

xing-yang avatar Oct 21 '24 17:10 xing-yang

Ok. I'll close it and file the one to add the new field

leonardoce avatar Oct 21 '24 18:10 leonardoce

Ok. I'll close it and file the one to add the new field @leonardoce can we please have any timeline by when can we get this done?

yati1998 avatar Oct 23 '24 10:10 yati1998

Hi @yati1998. I filed PR https://github.com/kubernetes-csi/external-snapshotter/pull/1169 to add the new field.

leonardoce avatar Oct 23 '24 15:10 leonardoce

/draft

leonardoce avatar Oct 23 '24 15:10 leonardoce

https://github.com/kubernetes-csi/external-snapshotter/pull/1169 has been merged - can we merge this PR now?

mulbc avatar Oct 28 '24 09:10 mulbc

@leonardoce can I know why this PR is in draft? I think we were good on this part?

yati1998 avatar Nov 06 '24 11:11 yati1998

Hi @yati1998. According to this plan: https://docs.google.com/document/d/1jjliw0tFhCPnT9ixlYpj61k5u8_-awf1qSCWrkunshQ/edit?tab=t.0#heading=h.fizenae5evi0 We need at least https://github.com/kubernetes-csi/external-snapshotter/pull/1177 to be merged before this one.

leonardoce avatar Nov 06 '24 12:11 leonardoce

@leonardoce can you update this PR to resolve the conflicts?

gnufied avatar Nov 19 '24 18:11 gnufied

Can you please rebase?

xing-yang avatar Nov 21 '24 03:11 xing-yang

@gnufied can you please review this as well?

yati1998 avatar Nov 22 '24 02:11 yati1998

@leonardoce Please ignore my comments:)

xing-yang avatar Nov 22 '24 18:11 xing-yang

We should compile this branch rebased with master and then run e2e tests in a loop for sometime. I have often done this for testing pre-release code which needs more testing. This will give us greater confidence about this feature. This does not require making a release.

gnufied avatar Nov 22 '24 18:11 gnufied

I noticed that following field in volumeGroupSnapshot.Status is user editable after setting it:

BoundVolumeGroupSnapshotContentName *string `json:"boundVolumeGroupSnapshotContentName,omitempty" protobuf:"bytes,1,opt,name=boundVolumeGroupSnapshotContentName"`

Should this be immutable after creation?

gnufied avatar Nov 23 '24 03:11 gnufied

I noticed that following field in volumeGroupSnapshot.Status is user editable after setting it:

BoundVolumeGroupSnapshotContentName *string json:"boundVolumeGroupSnapshotContentName,omitempty" protobuf:"bytes,1,opt,name=boundVolumeGroupSnapshotContentName" Should this be immutable after creation?

This field is in the Status. Are we supposed to make fields in Status immutable? We have not done that for VolumeSnapshot APIs.

xing-yang avatar Nov 23 '24 15:11 xing-yang

This field is in the Status. Are we supposed to make fields in Status immutable? We have not done that for VolumeSnapshot APIs.

It would have been nice to have. But not a blocker I think.

gnufied avatar Nov 24 '24 14:11 gnufied

@gnufied @xing @leonardoce i have tested this PR + #1219 https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/organizations/jenkins/mini-e2e_k8s-1.31/detail/mini-e2e_k8s-1.31/201/pipeline/ https://github.com/ceph/ceph-csi/pull/4974 (we already have framework for volumegroupsnapshot)

This is what it does

  • Create 3 PVC
  • Create a groupsnapshot
  • Create a clones from the 3 snapshots (as part of group)
  • Create pod using the clones
  • Delete the pod
  • Delete the clones
  • Delete the group
  • Delete the PVC

Repeat the above in a loop of 20 , let me know if you guys wants me to test anything else?

Madhu-1 avatar Nov 25 '24 12:11 Madhu-1

Hey, I tested this using the CSI HostPath Driver in a GitHub Action. The test, which can be viewed here, includes the following steps:

  1. Create 3 PVCs and mount them to a Pod.
  2. Create a VolumeGroupSnapshot.
  3. Restore PVCs from VolumeSnapshots, mount them to a Pod.
  4. Delete the VolumeGroupSnapshot.
  5. Delete the Pods and PVCs.

This sequence is repeated 20 times.

cc @Madhu-1

iPraveenParihar avatar Dec 02 '24 08:12 iPraveenParihar

I have run the k8s e2e test on vgs. It is passing.

Conformance test: not doing test setup.
  I1202 21:02:28.820217 1196005 e2e.go:109] Starting e2e run "e05caadd-9770-4603-828c-799227b70b3f" on Ginkgo node 1
Running Suite: Kubernetes e2e suite - /home/myathnalli/data/work/k8s/kubernetes/_output/bin
===========================================================================================
Random Seed: 1733153547 - will randomize all specs

Will run 21 of 6622 specs
••Checking for custom logdump instances, if any
----------------------------------------------------------------------------------------------------
k/k version of the log-dump.sh script is deprecated!
Please migrate your test job to use test-infra's repo version of log-dump.sh!
Migration steps can be found in the readme file.
----------------------------------------------------------------------------------------------------
Sourcing kube-util.sh
Detecting project
Skeleton Provider: detect-project not implemented
Dumping logs from master locally to '/home/myathnalli/data/work/k8s/kubernetes/_artifacts'
KUBE_MASTER_IP:
KUBE_MASTER:
/home/myathnalli/data/work/k8s/kubernetes/cluster/log-dump/log-dump.sh: line 385: MASTER_NAME: unbound variable


Ran 2 of 6622 Specs in 428.863 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 6620 Skipped
PASS

Ginkgo ran 1 suite in 7m10.968249776s
Test Suite Passed
+ cleanup
+ '[' false = true ']'
+ '[' true = true ']'
+ kind export logs /home/myathnalli/data/work/k8s/kubernetes/_artifacts
Exporting logs for cluster "kind" to:
/home/myathnalli/data/work/k8s/kubernetes/_artifacts
+ kind delete cluster
Deleting cluster "kind" ...
Deleted nodes: ["kind-control-plane" "kind-worker2" "kind-worker"]
+ rm -f _output/bin/e2e.test
+ '[' -n /tmp/tmp.CQSRZTuJUg ']'
+ rm -rf /tmp/tmp.CQSRZTuJUg
+ CLEANED_UP=true
+ exit 0

manishym avatar Dec 02 '24 15:12 manishym

@gnufied The above is the results for e2e tests

Madhu-1 avatar Dec 02 '24 15:12 Madhu-1

I rebased it over the new master and tested it. My smoke tests look good.

leonardoce avatar Dec 02 '24 19:12 leonardoce

/lgtm

gnufied avatar Dec 02 '24 20:12 gnufied

/lgtm /approve

xing-yang avatar Dec 02 '24 21:12 xing-yang

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: leonardoce, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Dec 02 '24 21:12 k8s-ci-robot

This is great news!!! Amazing work, well done! Congratulations!

gbartolini avatar Dec 03 '24 08:12 gbartolini

API review lgtm

msau42 avatar Dec 09 '24 19:12 msau42

I was wondering how I can install the update when I am getting the following error CustomResourceDefinition.apiextensions.k8s.io "volumegroupsnapshots.groupsnapshot.storage.k8s.io" is invalid: status.storedVersions[0]: Invalid value: "v1alpha1": must appear in spec.versions

See https://github.com/piraeusdatastore/helm-charts/issues/54

jonathon2nd avatar Dec 19 '24 17:12 jonathon2nd

@jonathon2nd kubectl patch crd x.groupsnapshot.storage.k8s.io --subresource='status' --type='merge' -p '{"status":{"storedVersions": ["v1beta1"]}}' where x is each of volumegroupsnapshots, volumegroupsnapshotclasses, and volumegroupsnapshotcontents

pl4nty avatar Dec 21 '24 01:12 pl4nty