
Cannot support existingClaim and different pvc for different vmstorage pods at the same time

chenlujjj opened this issue on Mar 20, 2024 • 4 comments

Hi Team, I'm trying to set up two VictoriaMetrics clusters in two GCP zones. For the primary cluster, I set existingClaim to an empty string so that the statefulset uses volumeClaimTemplates. For the secondary cluster, which is there in case the primary zone goes down, I'd like it to reuse the PVCs and PVs of the primary cluster, but that seems impossible right now: if I set existingClaim, all vmstorage pods use the same single PVC, which is not what I want.
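For reference, a rough sketch of the relevant values (key names follow the victoria-metrics-cluster chart; the replica count and the enabled flag here are assumptions, so check them against your chart version):

  # primary cluster values (sketch)
  vmstorage:
    replicaCount: 4
    persistentVolume:
      enabled: true
      existingClaim: ""   # empty string: the chart renders volumeClaimTemplates,
                          # so every vmstorage pod gets its own per-ordinal PVC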

chenlujjj avatar Mar 20 '24 16:03 chenlujjj

Hello,

I'd like it to reuse the PVCs and PVs of the primary cluster, but that seems impossible right now: if I set existingClaim, all vmstorage pods use the same single PVC, which is not what I want.

You want the secondary cluster to use the PVCs and PVs of the primary cluster, yet you don't want the pods to use the same PVC. That has me confused, could you elaborate?

Haleygo avatar Mar 21 '24 07:03 Haleygo

Sorry for the confusion. Say the primary cluster has 4 vmstorage pods, and their PVCs are vmstorage-0, vmstorage-1, vmstorage-2, vmstorage-3. I want the secondary cluster's vmstorage pods to use these PVCs respectively: its pod 0 uses vmstorage-0, its pod 1 uses vmstorage-1, and so on. If I set existingClaim to vmstorage-0, all four pods use that one PVC, which is not what I want.

chenlujjj avatar Mar 21 '24 09:03 chenlujjj

I got a workaround for this issue, though it's a bit ugly (see the values sketch after this list):

  1. I updated this line in the chart template to:

       name: {{ .Values.vmstorage.persistentVolume.name | default "vmstorage-volume" }}

     so that I can customize the PVC name in values.yaml.

  2. For the primary cluster, I set vmstorage.persistentVolume.name to vmstorage-volume-standby, so the actual PVC names are vmstorage-volume-standby-victoria-metrics-cluster-vmstorage-{0/1/2/3}.

  3. For the secondary cluster, named standby-victoria-metrics-cluster, I set vmstorage.persistentVolume.name to vmstorage-volume, so its PVC names end up identical to the primary cluster's, ordinal by ordinal.
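In values terms, the trick looks roughly like this (a sketch; the release names are inferred from the PVC names above, and the PVC naming follows the standard Kubernetes convention of <claim template name>-<statefulset name>-<ordinal>):

  # primary cluster, release "victoria-metrics-cluster" (sketch)
  vmstorage:
    persistentVolume:
      name: vmstorage-volume-standby
  # PVCs: vmstorage-volume-standby-victoria-metrics-cluster-vmstorage-{0..3}

  # secondary cluster, release "standby-victoria-metrics-cluster" (sketch)
  vmstorage:
    persistentVolume:
      name: vmstorage-volume
  # PVCs: vmstorage-volume-standby-victoria-metrics-cluster-vmstorage-{0..3}

Both releases render the same PVC names, so the standby cluster's pods bind to the primary cluster's volumes.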

chenlujjj avatar Mar 21 '24 11:03 chenlujjj

The reason I'm trying to reuse the PVCs is that I want to use GCP regional persistent disks to improve availability.

chenlujjj avatar Mar 21 '24 11:03 chenlujjj

Oh, okay, I'm afraid you can't do that: two vmstorage processes can't use the same volume as their storage. First of all, vmstorage creates a flock.lock file under "/vm-data" to make sure it's the only process with exclusive access to "/vm-data", and it panics if it can't acquire the lock. Even if we removed this check, you normally shouldn't have two different processes reading and writing the same volume; it leads to duplicated and conflicting data. And if one process corrupts the data, you lose both clusters at once, at which point there is no sense in calling it HA.

Haleygo avatar Mar 22 '24 03:03 Haleygo

@Haleygo Thanks for your reply. I understand your concern.

I should have pointed out that the secondary cluster will not read and write the same volume as the primary cluster at the same time; it is only for disaster recovery. In other words, the secondary cluster is started only once we find that the primary cluster is unavailable and no longer working.

chenlujjj avatar Mar 22 '24 06:03 chenlujjj

So you have two different availability zones or regions: the first vmcluster is deployed in zoneA, the second in zoneB, and the zoneB components are all stopped (replicas=0) at first. If zoneA is completely down, you can increase the zoneB vmcluster's replicas (it's the same as creating a new vmcluster) and use zoneB. But if zoneA is not completely down and only several vmstorage nodes are down, e.g. storage0 and storage2, then you can't use zoneB to help recover, since you would end up with duplicate storage1 and storage3. Do I understand correctly?

If so, then every method that generates the same persistentVolume names for the two vmclusters will work, like yours above, and I don't see how we could improve the chart for this particular case.
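For completeness, a cold-standby zoneB values file might look roughly like this (a sketch; the replicaCount keys are assumed from the victoria-metrics-cluster chart and should be verified against your chart version):

  # zoneB standby values (sketch): everything scaled to zero until failover
  vmselect:
    replicaCount: 0
  vminsert:
    replicaCount: 0
  vmstorage:
    replicaCount: 0
    persistentVolume:
      name: vmstorage-volume   # chosen so PVC names match zoneA's (see the workaround above)

On failover you would bump the replica counts, and the vmstorage StatefulSet would bind to the already-existing PVCs by name.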

Haleygo avatar Mar 22 '24 07:03 Haleygo

Correct, that's what I mean.

chenlujjj avatar Mar 22 '24 07:03 chenlujjj

Maybe I can raise a PR to make vmstorage.persistentVolume.name customizable so that I don't need to host the chart separately? Thanks

chenlujjj avatar Mar 22 '24 07:03 chenlujjj

Maybe I can raise a PR to make vmstorage.persistentVolume.name customizable so that I don't need to host the chart separately? Thanks

yeah, feel free)

You can also try the k8s stack chart; VMClusterSpec.VMStorage.storage supports a customized VolumeClaimTemplate name as well.
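As a sketch, the equivalent VMCluster resource would look something like this (field paths follow the operator's StorageSpec; the name, replica count, and sizes are placeholders, so double-check against the operator docs):

  apiVersion: operator.victoriametrics.com/v1beta1
  kind: VMCluster
  metadata:
    name: example-vmcluster
  spec:
    vmstorage:
      replicaCount: 4
      storage:
        volumeClaimTemplate:
          metadata:
            name: vmstorage-volume   # customized claim template name
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi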

Haleygo avatar Mar 22 '24 16:03 Haleygo

I created a PR https://github.com/VictoriaMetrics/helm-charts/pull/939

chenlujjj avatar Mar 24 '24 14:03 chenlujjj

The request has been released in victoria-metrics-cluster-0.11.14, closing as completed.

Haleygo avatar Mar 29 '24 09:03 Haleygo

Hi @chenlujjj, @Haleygo, I have a similar use case. While I can successfully configure a custom name for volumeClaimTemplates using vmstorage.persistentVolume.name, there's an inconsistency in how this name is applied throughout the rendered manifest.

Using the latest helm chart, I'm able to configure a custom name for volumeClaimTemplates, and the volume and PVC names are changed as expected in the rendered vmstorage StatefulSet manifest. However, the volume name referenced in volumeMounts is not updated and still uses the default vmstorage-volume.

For example, if I configure vmstorage.persistentVolume.name: restored-vmstorage-volume in values, the rendered volumeClaimTemplates section of the StatefulSet is correct:

  volumeClaimTemplates:
    - metadata:
        name: restored-vmstorage-volume

Consequently, in the provisioned pod, the volume is configured with the name restored-vmstorage-volume as expected:

  volumes:
  - name: restored-vmstorage-volume
    persistentVolumeClaim:
      claimName: restored-vmstorage-volume-vmstorage-0

However, the volumeMounts section still references the old name:

    volumeMounts:
    - mountPath: /storage
      name: vmstorage-volume

This causes the deployment to fail with a "volume vmstorage-volume not found" error.

The issue could be resolved by updating the volumeMounts section to use the custom name specified in vmstorage.persistentVolume.name, which is restored-vmstorage-volume in this example.
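In template terms, the fix would presumably mirror the volumeClaimTemplates change, something like this (a sketch of the chart's volumeMounts entry; the exact template line may differ):

    volumeMounts:
    - mountPath: /storage
      name: {{ .Values.vmstorage.persistentVolume.name | default "vmstorage-volume" }}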

srinusanduri avatar Jul 31 '24 06:07 srinusanduri

@Haleygo Created this issue to resolve this.

srinusanduri avatar Jul 31 '24 07:07 srinusanduri