
RWX Persistent storage / FSaaS integration

mhurtrel opened this issue 4 years ago • 29 comments

As an MKS user, I want to be able to share persistent volumes across multiple worker nodes, to ensure compatibility with the most demanding stateful scenarios.

Notes: We are working with our IaaS storage colleagues to support the upcoming FSaaS service as RWX Kubernetes storage. This feature may initially open in a selection of regions only. A workaround is to manually configure NAS-HA volumes: https://docs.ovh.com/gb/en/kubernetes/Configuring-multi-attach-persistent-volumes-with-ovhcloud-nas-ha/
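
For reference, the linked workaround boils down to declaring a static NFS-backed PersistentVolume pointing at the NAS-HA export, then binding a claim to it. A minimal sketch (the server IP and export path below are hypothetical placeholders; take the real values from your OVHcloud control panel, per the guide above):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nas-ha-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                  # NFS allows mounting from several nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.1                 # hypothetical NAS-HA IP
    path: /zpool-XXXXXX/kubernetes   # hypothetical NAS-HA partition path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nas-ha-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""               # skip dynamic provisioning, bind statically
  volumeName: nas-ha-volume
  resources:
    requests:
      storage: 10Gi
```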

mhurtrel avatar Oct 22 '20 19:10 mhurtrel

This feature is linked to https://github.com/ovh/public-cloud-roadmap/issues/54

mhurtrel avatar Nov 25 '20 10:11 mhurtrel

Hello, does this feature have an approximate delivery date for 2021?

The workaround is interesting, but native support would make the cluster much easier to manage.

RomainPhil avatar Jan 20 '21 15:01 RomainPhil

Hi @RomainPhil. Yes: my storage colleagues target spring 2021. We will give a more precise ETA as soon as we are able to :)

mhurtrel avatar Jan 20 '21 19:01 mhurtrel

Hi @mhurtrel ,

The problem with NAS-HA is that, apart from the storage, all resources are shared (RAM/CPU/bandwidth). See here: https://docs.ovh.com/gb/en/storage/faq-nas/

This can cause trouble when running time-critical stateful apps.

As an example, I attempted to use an NFS volume for a PostgreSQL database (bitnami/postgres). When the database starts and launches its initialisation process, it takes so much time that Kubernetes believes there is a problem and kills the PostgreSQL container. This leaves the initialisation incomplete and results in a non-working database. It can be worked around by increasing the initialDelaySeconds value, but that's not a really good solution.
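
(For illustration, the knob mentioned above is the liveness probe on the PostgreSQL container. A hypothetical excerpt with made-up timings; the bitnami chart exposes equivalent values:)

```yaml
# Container spec excerpt: a generous initialDelaySeconds keeps Kubernetes
# from killing PostgreSQL while the slow NFS-backed initialisation runs.
livenessProbe:
  exec:
    command: ["pg_isready", "-U", "postgres"]
  initialDelaySeconds: 300   # hypothetical value; tune to your volume's speed
  periodSeconds: 10
  failureThreshold: 6
```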

The only trick I have found to make it work so far is to use Cinder storage and pin all the storage-requesting pods to a particular node using nodeSelector. But as we deploy more and more apps, this puts too many services on a single node.
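
(The pinning trick looks roughly like this; the node name is hypothetical:)

```yaml
# Deployment spec excerpt: pin every storage-consuming pod to one node so
# that their RWO Cinder volumes can all be attached to the same instance.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: my-storage-node-1   # hypothetical node name
```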

I know running databases in a k8s cluster is considered bad practice, but I guess this is one of the stateful scenarios you imagined?

Do you think there is a way to allow a Cinder volume to be attached to multiple instances (read: worker nodes) as a temporary solution until FSaaS arrives? Or maybe it is not possible and I'm not aware of that, sorry for asking then ;)

jbod avatar Feb 12 '21 17:02 jbod

Thanks for this detailed feedback!

I confirm that we are working on 3 different tracks:

  • The most suitable one is FSaaS, which is targeted for late spring.
  • A second one is multi-attach Cinder disks. I have seen various feedback from users of this approach showing it can come with limitations when 2 worker nodes try to write data at the same time, so I anticipate this could be a problem with a DB.
  • My NAS colleagues are planning a NAS offering with improved performance.

Before this late spring release, my best advice would be to keep this DB outside of the cluster, running it on VMs for example, or to use the workaround you suggested.

mhurtrel avatar Feb 13 '21 07:02 mhurtrel

Hi @mhurtrel, is there any update on this issue? Spring is almost over. Are you going to introduce a new storage class for it, or will it work with csi-cinder-high-speed/classic out of the box?

I just ran into the problem with some Helm charts requiring RWX, without much flexibility to make them work using RWO.
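
For context, what those charts emit is a claim like the one below; since the cinder classes are block storage and effectively single-attach, such a claim cannot be satisfied today (a hypothetical sketch):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany                          # what the chart asks for
  storageClassName: csi-cinder-high-speed    # existing class, RWO in practice
  resources:
    requests:
      storage: 5Gi
```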

ZuSe avatar May 19 '21 13:05 ZuSe

Hello @ZuSe, unfortunately my storage colleagues have delayed the FSaaS by a couple of months, and I am waiting for a new ETA, which should be somewhere during summer. I reasonably hope we will be able to integrate it late summer.

mhurtrel avatar May 19 '21 13:05 mhurtrel

The FSaaS this feature relies on has been postponed to early 2022. Sorry about the delay. The workaround described above can still help in the meantime.

mhurtrel avatar Jun 28 '21 15:06 mhurtrel

Hi @mhurtrel, any news on this feature? It looks like it would meet our needs, and since we are now in early 2022, I hoped you could give a more precise ETA? I don't think I should bother investigating the workaround if the feature is coming in a few months' time.

elavrom avatar Feb 05 '22 13:02 elavrom

Hello. I don't have a new short-term ETA unfortunately, but I will work with my storage colleagues to obtain this info ASAP.

mhurtrel avatar Feb 11 '22 17:02 mhurtrel

Are there any updates?

usergit avatar Mar 03 '22 22:03 usergit

It would be great if we could have some estimated timeframe. Thanks

matmicro avatar Mar 21 '22 12:03 matmicro

Current ETA is late this calendar year

mhurtrel avatar Mar 21 '22 12:03 mhurtrel

push

genjudev avatar Jun 10 '22 10:06 genjudev

Hello! I confirm this is still a priority for us, and it will come as soon as https://github.com/ovh/public-cloud-roadmap/issues/54 is made available by our storage colleagues. Sorry for the delay.

mhurtrel avatar Jun 10 '22 10:06 mhurtrel

Thanks for keeping us in the loop; this could be a very powerful feature for a lot of our use cases.

usergit avatar Jun 13 '22 06:06 usergit

Is there a new ETA, or is it actually coming this year? It would be a nice Christmas gift :)

atino avatar Dec 19 '22 09:12 atino

According to the last update on the closed ticket, it's linked to #305, which was moved to "coming soon" (~2 months) on their roadmap last week: https://github.com/ovh/public-cloud-roadmap/projects/1#card-87623617

Beanux avatar Jan 26 '23 10:01 Beanux

Hey there, #305 has been released and closed, any update regarding this issue? Is it still blocked by #54? Thanks

bh42 avatar Jun 22 '23 10:06 bh42

Yes indeed, the dependency is on #54.

mhurtrel avatar Jun 22 '23 10:06 mhurtrel

After fighting with this for some time, I found a project that does the job (it creates an internal NFS server that allows sharing the otherwise non-shareable OVH PVCs via an internal StorageClass using NFS): https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner
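
Once the chart from that repo is installed, it registers a StorageClass (named `nfs` by default, if I read the chart right) backed by the in-cluster NFS server, and sharing then looks like an ordinary RWX claim. A sketch, assuming the default class name:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-nfs-claim
spec:
  storageClassName: nfs      # default class created by the chart; adjust if overridden
  accessModes:
    - ReadWriteMany          # multiple worker nodes can now mount the volume
  resources:
    requests:
      storage: 8Gi
```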

zguig52 avatar Jun 23 '23 18:06 zguig52

> After fighting with this for some time, I found a project that does the job (it creates an internal NFS server that allows sharing the otherwise non-shareable OVH PVCs via an internal StorageClass using NFS): https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner

Keep in mind that this quick-and-easy provisioner relies on a PVC that relies on a single disk. If that disk crashes, you'll lose your data. For a more robust but heavier solution, check out Rook, OpenEBS or alternatives.

danvy avatar Jun 24 '23 05:06 danvy

It's been a very long wait for RWX, and the NAS-with-NFS workaround is not a production-grade solution. Every project has to be rethought around the RWO constraint; it's a pain in the ass. Maybe it's time to put the highest priority on this and invest some energy.

gsontag avatar Jun 25 '23 09:06 gsontag

@gsontag it remains a top priority, and we have been making sure our storage colleagues build a solution that allows cloud-native use cases. They have dedicated multiple people to this for a few quarters, and are hopefully looking at a release in the next semester (see #54). As soon as they release this cloud-native compatible file storage, we will prioritize its integration into Managed Kubernetes as our main focus.

mhurtrel avatar Jun 25 '23 10:06 mhurtrel

Also looking for ReadWriteMany compatibility...

monsty avatar Dec 01 '23 18:12 monsty

Any news about this feature?

sevenGroupFrance avatar Dec 07 '23 16:12 sevenGroupFrance

Hello @sevenGroupFrance, I confirm our storage colleagues are actively developing the FSaaS prerequisites: https://github.com/ovh/public-cloud-roadmap/issues/54

We have raised our requirements and both teams have identified the associated effort (both on the FSaaS itself and on the CCM part), so that both will be addressed during spring. We expect a GA before summer and will update this issue as we approach the release.

mhurtrel avatar Dec 08 '23 10:12 mhurtrel