parallel operation from the trident

Open peleg2312 opened this issue 7 months ago • 14 comments

Describe the bug: I am using Trident 24.10 with NAS. I have an OpenShift environment that creates a lot of PVCs at once, around 50 at the same time, but it looks like Trident can handle only one PVC at a time, creating a volume for it and binding it. What can I do to make it work more in parallel and faster?

peleg2312 avatar May 13 '25 18:05 peleg2312

Hello, @peleg2312. Not much you can do at present, but we're working on it!

clintonk avatar May 13 '25 19:05 clintonk

Can you suggest anything I could change to make this possible? It is very important to me that this happens as soon as possible. Or is there a way to make Trident as a whole faster?

peleg2312 avatar May 13 '25 19:05 peleg2312

@peleg2312, it's a non-trivial problem. There are some operations that may be parallelized, and others that must be prevented. And we have to build in limiters as well, as you wouldn't want to create 50 volumes at the same time on the same storage system. With ontap-nas, a PVC should be bound in just a few seconds. If you are seeing much higher latencies on a busy storage system, please open a support case for that.

clintonk avatar May 14 '25 13:05 clintonk
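
The idea of parallelizing some operations while still limiting how many hit one storage system at a time can be sketched as a worker pool guarded by a semaphore. This is only an illustration of the general pattern, not Trident's implementation: `create_volume`, `provision`, and `MAX_IN_FLIGHT` are hypothetical names, and the storage call is replaced by a short sleep.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # hypothetical cap on concurrent calls to one backend

limiter = threading.BoundedSemaphore(MAX_IN_FLIGHT)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0


def create_volume(name: str) -> str:
    """Stand-in for a storage API call; it sleeps instead of contacting a backend."""
    global in_flight, peak_in_flight
    with limiter:  # blocks while MAX_IN_FLIGHT calls are already running
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        time.sleep(0.01)  # simulated backend latency
        with lock:
            in_flight -= 1
    return f"{name}: bound"


def provision(names, workers=16):
    """Submit all requests to a pool; the semaphore enforces the backend limit."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(create_volume, names))


results = provision([f"pvc-{i}" for i in range(50)])
print(len(results), "volumes created; peak concurrency:", peak_in_flight)
```

The pool accepts all 50 requests at once, but no more than `MAX_IN_FLIGHT` simulated backend calls run at the same moment, which is the kind of limiter described above.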

It takes around 8 seconds to create a PVC and get it bound, but if I need around 250 PVCs it takes 22 minutes, unlike an operator such as csi-powerflex, which takes far less time because it runs operations in parallel. I am running services that depend heavily on this, and I know it, but I still need some solution. Why shouldn't I want to create 50 volumes at the same time? Why is there a limiter of 2 volumes at a time, instead of just sending the requests to workers like the normal CSI provisioner so that they run in parallel? If you have an idea for a quick change in the code, I would be happy to hear it.

peleg2312 avatar May 15 '25 18:05 peleg2312
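
As a rough back-of-the-envelope check on the figures quoted above (about 8 seconds per PVC, 250 PVCs, 22 minutes in total), the following sketch compares a fully serial run with an idealized parallel one. The helper `batch_minutes` and the concurrency value of 10 are assumptions for illustration; real provisioning times would depend on the backend and on whatever limits the controller applies.

```python
import math


def batch_minutes(count, seconds_each, concurrency=1):
    """Idealized duration: requests run in equal-length waves of `concurrency`."""
    waves = math.ceil(count / concurrency)
    return waves * seconds_each / 60


serial = batch_minutes(250, 8)          # one PVC at a time, 8 s each
observed_per_pvc = 22 * 60 / 250        # average seconds per PVC in the reported run
parallel_10 = batch_minutes(250, 8, 10) # hypothetical: 10 requests in flight

print(f"serial estimate:     {serial:.1f} min")
print(f"observed average:    {observed_per_pvc:.2f} s per PVC")
print(f"10-way parallel est: {parallel_10:.1f} min")
```

Under these assumptions the reported 22 minutes works out to roughly 5.3 seconds per PVC, which is in the range of one-at-a-time processing, while an idealized 10-way parallel run would finish in a few minutes.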

@peleg2312 Trident has always been single-threaded, and there is little you can do until we fix that. Please understand we are working on this at high priority, and a solution will likely roll out over multiple releases.

clintonk avatar May 28 '25 13:05 clintonk

Just as a status update and overview of the improvements already delivered with 25.06: https://community.netapp.com/t5/Tech-ONTAP-Blogs/Trident-Controller-Parallelism/ba-p/461918

wonderland avatar Aug 01 '25 07:08 wonderland

From my understanding, it looks like it only works for SAN right now. When will it come out for NAS?

peleg2312 avatar Aug 01 '25 09:08 peleg2312

Yes and no ;-) There is work on concurrent operations at the node level (such as mount/unmount) and at the controller level (such as API calls to ONTAP).

For NAS, concurrent operations at the node level were added with 25.02 and I assume you would see improvements over the 24.10 version you had tested with. See https://github.com/NetApp/trident/blob/stable/v25.06/CHANGELOG.md?plain=1#L104

For the controller level, this is iSCSI/FCP initially, with other protocols following. Can't comment on roadmap here but Trident has a history of fast-paced innovation ;-)

wonderland avatar Aug 01 '25 12:08 wonderland

But how does concurrency at the node level help me for NAS? Aren't operations like attach and create all at the controller level? What is the problem with integrating concurrency for NAS if it is already done for SAN? If it is not too complicated, I would even like to try it myself.

peleg2312 avatar Aug 01 '25 12:08 peleg2312

It depends on the case, e.g. draining a node will be different than mass-provisioning of new PVCs. No one said this is the final state, I just thought a short status update would be welcomed

As to why this is not trivial - the blog I linked above has more details on that. The complexity and amount of work make this a multi-release journey (and a tech preview feature initially)

wonderland avatar Aug 01 '25 12:08 wonderland

Okay, I understand, so we will just have to wait for the release. Another quick question: there isn't a way to make an iSCSI PVC ReadWriteMany, right?

peleg2312 avatar Aug 02 '25 11:08 peleg2312

Hi @peleg2312 Block & RWX has been supported with Trident for several years already (though not very common outside of KubeVirt configurations) Here is the link that shows you the matrix protocol vs access mode: https://docs.netapp.com/us-en/trident/trident-use/ontap-san.html#ontap-san-driver-details

the trick is to set the parameter: volumeMode: Block

hth

YvosOnTheHub avatar Aug 04 '25 06:08 YvosOnTheHub
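
For illustration, a claim that combines `volumeMode: Block` with the `ReadWriteMany` access mode could be generated as below. This is a minimal sketch: the PVC name, size, and the storage class name `ontap-san-sc` are hypothetical placeholders, and the storage class is assumed to point at an ontap-san backend.

```python
import json


def raw_block_rwx_pvc(name, storage_class, size):
    """Build a PersistentVolumeClaim manifest for a shared raw block volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "volumeMode": "Block",  # raw block device, no filesystem created
            "storageClassName": storage_class,  # hypothetical class name
            "resources": {"requests": {"storage": size}},
        },
    }


manifest = raw_block_rwx_pvc("shared-block", "ontap-san-sc", "10Gi")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. A pod consuming such a claim references it under `volumeDevices` with a `devicePath`, rather than under `volumeMounts`.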

In this situation, how does it work? Who handles the file system, and how do the two pods avoid destroying each other's data?

peleg2312 avatar Aug 04 '25 07:08 peleg2312

this would be entirely handled by the higher layers (the app), as the volume is mounted as a Raw Block Device

in the case of KubeVirt (KV), RWX is a requirement for live migration

YvosOnTheHub avatar Aug 04 '25 07:08 YvosOnTheHub