kubernetes-csi-addons

Add volume group replication controller code

Open Nikhil-Ladha opened this issue 1 year ago • 5 comments

This PR adds the following:

  • Volume group replication controller logic
  • Generated CRDs and RBAC rules
  • Docs for creating VGR, VGRClass and VGRContent CRs

Nikhil-Ladha, Jul 08 '24 12:07

The changes are not yet tested. I will wait for #605 to get merged, then update a few things and test them. In the meantime, please feel free to provide some initial reviews. Thanks :)

Nikhil-Ladha, Jul 08 '24 12:07

Can you include some testing results in a PR-comment?

Sure, I can add the creation of the CRs and the group handle creation. The grouping still fails because the backend is not ready yet; I am waiting for https://github.com/ceph/ceph-csi/pull/4739 to be merged and for the omap generation process to be completed.

Nikhil-Ladha, Aug 02 '24 08:08

Looks very promising to me, thanks!

Can you include some testing results in a PR-comment?

Here's the result of creating the VGR:

apiVersion: v1
items:
- apiVersion: replication.storage.openshift.io/v1alpha1
  kind: VolumeGroupReplication
  metadata:
    annotations:
      pvcSelector: group=replication
      replication.storage.openshift.io/volume-replication-name: vr-c4e68900-e0c5-465b-802a-c6fd011ff40c
      replication.storage.openshift.io/volumegroupreplication-content-name: vgrcontent-c4e68900-e0c5-465b-802a-c6fd011ff40c
    creationTimestamp: "2024-08-08T05:53:25Z"
    finalizers:
    - replication.storage.openshift.io/vgr-protection
    generation: 3
    labels:
      app.kubernetes.io/created-by: kubernetes-csi-addons
      app.kubernetes.io/instance: volumegroupreplication-sample
      app.kubernetes.io/managed-by: kustomize
      app.kubernetes.io/name: volumegroupreplication
      app.kubernetes.io/part-of: kubernetes-csi-addons
    name: volumegroupreplication-sample
    namespace: rook-ceph
    resourceVersion: "193975"
    uid: c4e68900-e0c5-465b-802a-c6fd011ff40c
  spec:
    autoResync: false
    replicationState: primary
    source:
      selector:
        matchLabels:
          group: replication
    volumeGroupReplicationClassName: volumegroupreplicationclass-sample
    volumeGroupReplicationContentName: vgrcontent-c4e68900-e0c5-465b-802a-c6fd011ff40c
    volumeReplicationClassName: rbd-volumereplicationclass
    volumeReplicationName: vr-c4e68900-e0c5-465b-802a-c6fd011ff40c
  status:
    conditions:
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: FailedToPromote
      status: "False"
      type: Completed
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: Error
      status: "True"
      type: Degraded
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: NotResyncing
      status: "False"
      type: Resyncing
    message: 'volume 0001-0009-rook-ceph-0000000000000002-104cfb18-4ea6-4916-b301-94401d466e34
      not found: Failed as image not found (internal RBD image not found)'
    observedGeneration: 1
    persistentVolumeClaimsRefList:
    - name: test-pvc
    state: Unknown

The CR is created, and the dependent CRs, i.e., VR and VGRContent, are created as well:

apiVersion: v1
items:
- apiVersion: replication.storage.openshift.io/v1alpha1
  kind: VolumeGroupReplicationContent
  metadata:
    annotations:
      replication.storage.openshift.io/volumegroupref: volumegroupreplication-sample/rook-ceph
    creationTimestamp: "2024-08-08T05:53:25Z"
    finalizers:
    - replication.storage.openshift.io/vgr-protection
    generation: 2
    name: vgrcontent-c4e68900-e0c5-465b-802a-c6fd011ff40c
    resourceVersion: "193969"
    uid: 10c4a992-107b-49ad-aa77-6bc989f42184
  spec:
    provisioner: rook-ceph.rbd.csi.ceph.com
    source:
      volumeHandles:
      - 0001-0009-rook-ceph-0000000000000002-6526bb74-044a-489f-a821-7d8c042719b4
    volumeGroupReplicationClassName: volumegroupreplicationclass-sample
    volumeGroupReplicationHandle: 0001-0009-rook-ceph-0000000000000002-104cfb18-4ea6-4916-b301-94401d466e34
    volumeGroupReplicationRef:
      apiVersion: replication.storage.openshift.io/v1alpha1
      kind: VolumeGroupReplication
      name: volumegroupreplication-sample
      namespace: rook-ceph
      uid: c4e68900-e0c5-465b-802a-c6fd011ff40c
  status:
    persistentVolumeRefList:
    - name: pvc-7228cbb8-7a2f-4daf-bea4-4dabbd2ecabe

apiVersion: v1
items:
- apiVersion: replication.storage.openshift.io/v1alpha1
  kind: VolumeReplication
  metadata:
    annotations:
      replication.storage.openshift.io/volumegroupref: volumegroupreplication-sample/rook-ceph
    creationTimestamp: "2024-08-08T05:53:26Z"
    finalizers:
    - replication.storage.openshift.io
    generation: 1
    name: vr-c4e68900-e0c5-465b-802a-c6fd011ff40c
    namespace: rook-ceph
    ownerReferences:
    - apiVersion: replication.storage.openshift.io/v1alpha1
      kind: VolumeGroupReplication
      name: volumegroupreplication-sample
      uid: c4e68900-e0c5-465b-802a-c6fd011ff40c
    resourceVersion: "193974"
    uid: 2f1f45bf-417b-41ad-9973-71741dda555d
  spec:
    autoResync: false
    dataSource:
      apiGroup: replication.storage.openshift.io
      kind: VolumeGroupReplication
      name: volumegroupreplication-sample
    replicationHandle: ""
    replicationState: primary
    volumeReplicationClass: rbd-volumereplicationclass
  status:
    conditions:
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: FailedToPromote
      status: "False"
      type: Completed
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: Error
      status: "True"
      type: Degraded
    - lastTransitionTime: "2024-08-08T05:53:26Z"
      message: ""
      observedGeneration: 1
      reason: NotResyncing
      status: "False"
      type: Resyncing
    message: 'volume 0001-0009-rook-ceph-0000000000000002-104cfb18-4ea6-4916-b301-94401d466e34
      not found: Failed as image not found (internal RBD image not found)'
    observedGeneration: 1
    state: Unknown
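
Reading the output above, the names of the dependent objects look like they are built from the VGR's UID (`vr-<uid>` and `vgrcontent-<uid>`), and the `volumegroupref` annotation carries `<name>/<namespace>`. A minimal sketch of that convention, where the helper names `dependentNames` and `parseGroupRef` and the prefix constants are made up for illustration and are not taken from the controller code:

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical prefixes, inferred only from the object names in the output above.
const (
	vrPrefix         = "vr-"
	vgrContentPrefix = "vgrcontent-"
)

// dependentNames returns the names that the VolumeReplication and
// VolumeGroupReplicationContent objects would get for a VGR with the given UID.
func dependentNames(vgrUID string) (vrName, vgrContentName string) {
	return vrPrefix + vgrUID, vgrContentPrefix + vgrUID
}

// parseGroupRef splits a volumegroupref annotation value, which in the
// output above has the form "<name>/<namespace>".
func parseGroupRef(ref string) (name, namespace string, err error) {
	parts := strings.Split(ref, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("malformed group reference %q", ref)
	}
	return parts[0], parts[1], nil
}

func main() {
	vr, content := dependentNames("c4e68900-e0c5-465b-802a-c6fd011ff40c")
	fmt.Println(vr)
	fmt.Println(content)

	name, ns, err := parseGroupRef("volumegroupreplication-sample/rook-ceph")
	fmt.Println(name, ns, err)
}
```

Deriving the names from the UID would keep them unique per VGR and make it easy to find the dependent objects from the parent.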

Nikhil-Ladha, Aug 08 '24 06:08

There are a few minor conflicts that need to be resolved.

Is there a way to set the status.conditions[*].message instead of, or in addition to, the single (last?) status.message?

nixpanic, Aug 27 '24 08:08

There are a few minor conflicts that need to be resolved.

Resolved the conflicts.

Is there a way to set the status.conditions[*].message instead of, or in addition to, the single (last?) status.message?

Well, we are just copying the status from the VolumeReplication CR, so I am not sure if there is any way we can change that other than updating the VolumeReplication CR itself :/
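
For illustration only, here is a rough sketch of what filling the per-condition message while copying could look like. `Condition` and `Status` are simplified, hypothetical stand-ins (not the real API types), and the rule "fill the empty message of a true Degraded condition" is an assumption, not what the controller does today:

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// Status mirrors only the status fields that matter for this sketch.
type Status struct {
	Conditions []Condition
	Message    string
	State      string
}

// copyStatus returns a copy of src. While copying, a Degraded condition that
// is "True" and has an empty message receives the top-level status message.
// The source status is left unmodified.
func copyStatus(src Status) Status {
	dst := Status{Message: src.Message, State: src.State}
	dst.Conditions = make([]Condition, len(src.Conditions))
	copy(dst.Conditions, src.Conditions)

	for i := range dst.Conditions {
		c := &dst.Conditions[i]
		if c.Type == "Degraded" && c.Status == "True" && c.Message == "" {
			c.Message = src.Message
		}
	}
	return dst
}

func main() {
	vr := Status{
		Conditions: []Condition{
			{Type: "Completed", Status: "False", Reason: "FailedToPromote"},
			{Type: "Degraded", Status: "True", Reason: "Error"},
		},
		Message: "volume not found",
		State:   "Unknown",
	}
	for _, c := range copyStatus(vr).Conditions {
		fmt.Printf("%s: %q\n", c.Type, c.Message)
	}
}
```

This keeps status.message as it is and only adds the detail to the condition, so existing consumers of status.message would not be affected.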

Nikhil-Ladha, Aug 27 '24 10:08

Can we please have a second round of review for the PR? cc @nixpanic @Rakshith-R @Madhu-1

Nikhil-Ladha, Mar 19 '25 16:03

@Madhu-1 addressed all your suggestions, please take a look now.

Nikhil-Ladha, Mar 21 '25 10:03

@nixpanic @Madhu-1 can either of you please do another approval (/review)?

Nikhil-Ladha, Apr 01 '25 12:04

Looks good, but the conflicts need to be resolved. Mainly, the updated golangci-lint version does not like referencing embedded (parent) structs explicitly, like r.Client.Get() when r.Get() is the same, or pvc.ObjectMeta.Name when you can use the shorter pvc.Name.

Resolved those, please take a look now.
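
For anyone following along, the lint complaint comes down to Go's promotion of embedded fields and methods. A small self-contained sketch, where `ObjectMeta`, `PVC`, `Client` and `Reconciler` are simplified, hypothetical stand-ins for the Kubernetes and controller-runtime types:

```go
package main

import "fmt"

// ObjectMeta is a hypothetical stand-in for the Kubernetes object metadata.
type ObjectMeta struct {
	Name string
}

// PVC embeds ObjectMeta, so ObjectMeta's fields are promoted to PVC.
type PVC struct {
	ObjectMeta
}

// Client is a hypothetical stand-in for an API client.
type Client struct{}

// Get pretends to fetch an object by name.
func (Client) Get(name string) string { return "got " + name }

// Reconciler embeds Client, so Client's methods are promoted to Reconciler.
type Reconciler struct {
	Client
}

func main() {
	pvc := PVC{ObjectMeta: ObjectMeta{Name: "test-pvc"}}
	r := Reconciler{}

	// The long and the short selectors refer to the same field and method,
	// so both comparisons print true.
	fmt.Println(pvc.ObjectMeta.Name == pvc.Name)
	fmt.Println(r.Client.Get(pvc.Name) == r.Get(pvc.Name))
}
```

Since both forms compile to the same access, switching to the shorter one is purely a readability change.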

Nikhil-Ladha, Apr 01 '25 15:04