celeborn icon indicating copy to clipboard operation
celeborn copied to clipboard

[CELEBORN-1528][HELM] Use volumeClaimTemplates to support various storage backend

Open ChenYi015 opened this issue 1 year ago • 5 comments

What changes were proposed in this pull request?

Make helm chart more customizable:

  • All master related configurations are prefixed with master, e.g. master.replicas, master.volumeClaimTemplates, master.volumeMounts.

  • All worker related configurations are prefixed with worker, e.g. worker.replicas, worker.nodeSelector, worker.affinity.

  • Add support for labels, env, envFrom and volumeClaimTemplates.

  • Bump chart appVersion to 0.5.1.

Why are the changes needed?

By using volumeClaimTemplates in StatefulSet, we can support various types of storage backend.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Test locally.

ChenYi015 avatar Jul 30 '24 08:07 ChenYi015

@pan3793 @RexXiong PTAL, thanks.

ChenYi015 avatar Jul 30 '24 12:07 ChenYi015

To make it can be reviewed and merged easily, I had created a new PR #2654 to do some values renaming. After it has been merged, I will do a rebase.

ChenYi015 avatar Jul 31 '24 11:07 ChenYi015

For my opinion, volumeClaimTemplates is needed but host path and empty path is more general for most of situations. It's better to add it as an option instead of the only choice.

lianneli avatar Aug 08 '24 09:08 lianneli

For my opinion, volumeClaimTemplates is needed but host path and empty path is more general for most of situations. It's better to add it as an option instead of the only choice.

@lianneli Thanks for the advice. Though, I think volumes with type hostPath and emptyDir are still supported by [master|worker].volumes. Before, the configuration is like this:

volumes:
  master:
  - mountPath: /mnt/celeborn_ratis
    hostPath: /mnt/celeborn_ratis
    type: hostPath
    capacity: 100Gi
  worker:
  - mountPath: /mnt/disk1
    hostPath: /mnt/disk1
    type: hostPath
    diskType: SSD
    capacity: 100Gi

After, it will be like this:

master:
  volumes:
  - name: celeborn-ratis
    hostPath: 
      path: /mnt/celeborn_ratis
  volumeMounts:
  - name: celeborn-ratis
    mountPath: /mnt/celeborn_ratis

worker:
  volumes:
  - name: disk1
    hostPath:
      path: /mnt/disk1
  volumeMounts:
  - name: disk1
    mountPath: /mnt/disk1

celeborn:
  celeborn.worker.storage.dirs=/mnt/disk1:disktype=SSD:capacity=100Gi

The difference is that we need to configure celeborn.worker.storage.dirs manually if we use hostPath or emptyDir.

ChenYi015 avatar Aug 08 '24 13:08 ChenYi015

If we want to use volumeClaimTemplates, suppose ssd-storage-class is a valid storage class in your Kubernetes cluster that provided by a cloud provider or defined by yourself that can be used to provide SSD cloud disks dynamically, the configuration will be like as follows:

master:
  volumeClaimTemplates:
  - metadata:
      name: celeborn-ratis
      annotations:
        celeborn.apache.org/disk-type: SSD
        celeborn.apache.org/disk-capacity: 100Gi
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        request:
          storage: 100Gi
        limits:
          storage: 100Gi
      storageClassName: ssd-storage-class
  volumeMounts:
  - name: celeborn-ratis
    mountPath: /mnt/celeborn_ratis

worker:
  volumeClaimTemplates:
  - metadata:
      name: disk1
      annotations:
        celeborn.apache.org/disk-type: SSD
        celeborn.apache.org/disk-capacity: 100Gi
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        request:
          storage: 100Gi
        limits:
          storage: 100Gi
      storageClassName: ssd-storage-class
  volumeMounts:
  - name: disk1
    mountPath: /mnt/disk1

The annotations celeborn.apache.org/disk-type and celeborn.apache.org/disk-capacity are used by Helm to render celeborn.worker.storage.dirs in configmap.

When we install the Helm chart, SSD cloud disks will be dynamically provisioned by cloud providers.

ChenYi015 avatar Aug 08 '24 13:08 ChenYi015

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Aug 29 '24 08:08 github-actions[bot]

This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Sep 19 '24 08:09 github-actions[bot]

This issue was closed because it has been staled for 10 days with no activity.

github-actions[bot] avatar Sep 30 '24 08:09 github-actions[bot]

Hey @ChenYi015 thanks a lot for working on this MR!

I saw this MR is marked as closed. Could you mention a bit your plan on this MR further?

Asking this question because we currently are doing similar customisation (using volumeClaimTemplates) on the chart internally, and want to align with you so that changes are not far away from each other :D

zemin-piao avatar Nov 05 '24 21:11 zemin-piao

Hi @ChenYi015 , Do you plan to create a new PR to support volumeClaimTemplates?

nqvuong1998 avatar Apr 02 '25 03:04 nqvuong1998

Yes, I will reopen this or create a new PR recently just after refactoring the Helm chart CELEBORN-1951.

ChenYi015 avatar Apr 02 '25 03:04 ChenYi015