ceph-csi

Move topologyConstrainedPools to Beta status

Open sbskas opened this issue 3 years ago • 12 comments

Describe the feature you'd like to have

At this time, topologyConstrainedPools has been in alpha for 2 years with very little noise. The feature is documented only in the Helm chart, and its alpha status may scare away many potential users.

What is the value to the end user? (why is it a priority?)

We're implementing a multi-zone cluster on on-premises bare metal. This cluster will host a few large Elasticsearch clusters. We have 21 nodes in the cluster:

1. 3 masters on VMs
2. 6 workers for applications
3. 12 workers for storage

All those nodes are spread across 3 datacenters connected with EVPN fabrics. This feature will allow us to build StatefulSets and scale them with storage and pods evenly spread across the datacenters using topologySpreadConstraints.

We deploy our Ceph cluster with Rook. We created 3 CephBlockPools, one for each zone, by changing the crushRoot to the zone root. Then we created a StorageClass whose topologyConstrainedPools entries match those zones. Finally, we added the corresponding topologySpreadConstraints to our StatefulSet.
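The setup described above could look roughly like the following sketch. It is not taken from our actual manifests: the pool names (pool-zone-a, ...), zone values, StorageClass name, StatefulSet name and image are all hypothetical placeholders, and the StorageClass omits the cluster ID and secret parameters that a real Rook RBD class needs. It also assumes the nodes carry the topology.kubernetes.io/zone label and that the nodeplugin has topology enabled with that label among its domain labels.

```yaml
# StorageClass mapping each zone to its own pool (names are placeholders).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-zoned
provisioner: rook-ceph.rbd.csi.ceph.com
# Delay provisioning until the pod is scheduled, so the zone is known.
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
parameters:
  # clusterID, secrets, image features, etc. omitted for brevity.
  topologyConstrainedPools: |
    [
      {"poolName": "pool-zone-a",
       "domainSegments": [{"domainLabel": "zone", "value": "zone-a"}]},
      {"poolName": "pool-zone-b",
       "domainSegments": [{"domainLabel": "zone", "value": "zone-b"}]},
      {"poolName": "pool-zone-c",
       "domainSegments": [{"domainLabel": "zone", "value": "zone-c"}]}
    ]
---
# StatefulSet spreading its pods (and therefore their volumes) across zones.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data
spec:
  serviceName: es-data
  replicas: 3
  selector:
    matchLabels:
      app: es-data
  template:
    metadata:
      labels:
        app: es-data
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: es-data
      containers:
        - name: elasticsearch
          image: example.invalid/elasticsearch:placeholder
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: rbd-zoned
        resources:
          requests:
            storage: 100Gi
```

With WaitForFirstConsumer, each volume is provisioned only after its pod has been placed, so the image should land in the pool of the zone the pod was scheduled to.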

How will we know we have a good solution? (acceptance criteria)

Right now, the feature works but has to be explicitly enabled on the nodeplugin daemonset. Moving to beta and then GA would make this activation optional and would bring proper documentation. After discussion on Slack, it seems that the feature still needs to be completed, but I failed to fully understand what's missing.

Additional context

Once this feature is in beta, the Rook operator needs a way to enable it on its resources.

sbskas avatar Jan 25 '22 17:01 sbskas

csi-provisioner doesn't seem to need '--feature-gates=Topology=true' anymore. According to the log:

csi-provisioner W0126 10:05:23.447685 1 feature_gate.go:237] Setting GA feature gate Topology=true. It will be removed in a future release.

sbskas avatar Jan 26 '22 10:01 sbskas

Let's try to move this to Beta or a supported state. We were tracking this effort under https://github.com/ceph/ceph-csi/issues/2199 and most of it has been addressed; this is one of the last pending items. @sbskas

humblec avatar Feb 02 '22 08:02 humblec

@sbskas would you be able to help us move this feature to Beta in the upcoming release? That basically means more testing, adding tests where they are missing, and updating the documentation. Please let us know. We would really appreciate it if we could lift the support level to Beta in the next release. We welcome contributions.

humblec avatar Feb 08 '22 09:02 humblec

As discussed in the triage call, we could target this for 3.6 if we have a volunteer; otherwise we will move it out of the 3.6 release.

humblec avatar Feb 16 '22 06:02 humblec

@humblec I volunteer.

sbskas avatar Feb 17 '22 13:02 sbskas

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Mar 19 '22 21:03 github-actions[bot]

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

github-actions[bot] avatar Mar 26 '22 21:03 github-actions[bot]

@sbskas as mentioned in the release issue, we are planning to release 3.6 early next week.

Based on your validation of this feature, is it good enough to qualify as Beta, or do we need more time before declaring it "beta"? Please share your thoughts so that we can reach consensus on the support state.

humblec avatar Apr 01 '22 11:04 humblec

@humblec we tested all the cases we encountered on our test platform and fixed the obvious problems we ran into (#2828 and #2925). In my opinion, the feature is ready to go to beta. I would have liked something like #2962 for easier integration with Rook; however, I can live without it.

sbskas avatar Apr 04 '22 14:04 sbskas

Thanks @sbskas for all the effort and for the conclusion on where we stand.

I am fine with calling this Beta; however, let me get at least an ack from @ceph/ceph-csi-maintainers.

humblec avatar Apr 04 '22 15:04 humblec

@Madhu-1 @nixpanic can you share your thoughts about moving this feature to Beta in this release (3.6)?

humblec avatar Apr 05 '22 05:04 humblec

> @Madhu-1 @nixpanic can you share your thought about moving this feature to Beta in this release (3.6) ?

IMO, as we are at the end of the release, let's not hurry. We can wait for #2962 and, based on that, move to beta. While we are still in alpha, we have a good chance to make breaking changes if they are really needed.

Even if the feature is alpha, we can work on the integration part.

Madhu-1 avatar Apr 05 '22 05:04 Madhu-1

All of this looks to have made it into release-3.7 quite a while ago. Please open a new issue if something isn't working as planned or expected.

nixpanic avatar Jul 20 '23 07:07 nixpanic