piraeus-operator
Scheduler extender plugin for LINSTOR
Follows up on https://github.com/piraeusdatastore/piraeus-operator/pull/219#issuecomment-1027282815
Yeah, now I understand why you decided to get rid of stork. It's quite huge and slow, contains a lot of complex controllers, requires broad permissions on the cluster, and does not support secure communication with the kube-scheduler. As a result, it gets updated slowly.
Nevertheless, I found that the stork scheduler extender component does not require that much maintenance.
E.g. it works fine with Kubernetes 1.22 and adjacent versions; only the config format for kube-scheduler has changed.
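For reference, this is the shape of the new format: the `extenders` section moved from the old Policy API into `KubeSchedulerConfiguration`. A minimal sketch; the URL, port, and verbs below are assumptions for a localhost sidecar setup, not the project's actual config:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf
extenders:
  # The extender runs as a sidecar, so plain HTTP over localhost is enough.
  - urlPrefix: http://127.0.0.1:8099
    filterVerb: filter
    prioritizeVerb: prioritize
    weight: 5
    httpTimeout: 30s
    # Don't block scheduling if the extender is briefly unavailable.
    ignorable: true
```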
I'm still convinced of my point: we shouldn't use the Kubernetes CSI topology feature until it can be updated dynamically after a volume is created. So I decided to write a separate linstor scheduler extender based on the existing stork driver and run it as a sidecar in the kube-scheduler pod.
By this I solved the following problems:
- We now have an independent linstor-scheduler with a small amount of code and no superfluous controllers/CRDs/whatever.
- Permissions are reduced to the same set the kube-scheduler has.
- Secure communication over localhost instead of raw HTTP requests over the Kubernetes network.
- The code is backward-compatible with the current stork driver.
Link to the project: https://github.com/kvaps/linstor-scheduler-extender. Take a look at the deploy folder; it has a self-sufficient example.
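As a rough illustration of the sidecar layout (image names and paths below are placeholders, not the actual manifest from that folder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
    - name: kube-scheduler
      image: k8s.gcr.io/kube-scheduler:v1.22.0
      command:
        - kube-scheduler
        - --config=/etc/kubernetes/scheduler-config.yaml
    # The extender shares the pod's network namespace, so the scheduler
    # reaches it on 127.0.0.1 without crossing the cluster network.
    - name: linstor-scheduler-extender
      image: registry.example.com/linstor-scheduler-extender:latest
```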
Nice work. I'll try to take a closer look at the project soon. A couple of things I'd like to make sure this project does:
- Work with topology, if enabled. I don't want to lock people into using only topology or only the scheduler plugin.
- Work with late binding. This is something that broke STORK when I last tried it. I.e. the scheduler gets called for a Pod which still has an unbound volume claim, because with late binding the claim is only bound after the Pod is scheduled.
I haven't looked at what is reused from the STORK plugin, but if it's not too much work I would like to not depend on Stork. That way, it is much easier to update.
> Work with late binding. This is something that broke STORK when I last tried it. I.e. the scheduler gets called for a Pod which still has an unbound volume claim, because with late binding the claim is only bound after the Pod is scheduled.
Works like a charm. linstor-scheduler just gives all nodes similar ranks for the first binding, letting Kubernetes decide.
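A minimal sketch of that behaviour, using hand-rolled types that mirror the scheduler-extender wire format; the real project's types, port, and handler names are assumptions here:

```go
package main

import (
	"encoding/json"
	"net/http"
)

// Minimal mirror of the scheduler-extender wire format
// (k8s.io/kube-scheduler/extender/v1); only the fields used here.
type extenderArgs struct {
	Nodes struct {
		Items []struct {
			Metadata struct {
				Name string `json:"name"`
			} `json:"metadata"`
		} `json:"items"`
	} `json:"nodes"`
}

type hostPriority struct {
	Host  string `json:"host"`
	Score int64  `json:"score"`
}

// prioritize handles POST /prioritize. While the pod's claims are still
// unbound there is no replica placement to prefer, so every node gets the
// same score and the default scheduler plugins make the call.
func prioritize(w http.ResponseWriter, r *http.Request) {
	var args extenderArgs
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	ranks := make([]hostPriority, 0, len(args.Nodes.Items))
	for _, n := range args.Nodes.Items {
		ranks = append(ranks, hostPriority{Host: n.Metadata.Name, Score: 0})
	}
	_ = json.NewEncoder(w).Encode(ranks)
}

func main() {
	http.HandleFunc("/prioritize", prioritize)
	// Bind to localhost only: the kube-scheduler sidecar is the sole client.
	_ = http.ListenAndServe("127.0.0.1:8099", nil)
}
```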
> Work with topology, if enabled. I don't want to lock people into using only topology or only the scheduler plugin.
Not tested yet, but I think it should work as well; the only difference is that the PV has a hardcoded nodeAffinity.
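For context, the affinity in question looks roughly like this on the PV; the topology key and node names here are placeholders:

```yaml
# Excerpt from a PersistentVolume created by a topology-aware CSI driver.
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1
                - worker-2
```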
> I haven't looked at what is reused from the STORK plugin, but if it's not too much work I would like to not depend on Stork. That way, it is much easier to update.
Of course I wanted to start the new project from k8s-scheduler-extender-example, or to simply disassemble the stork libs and get rid of all their extra parts, but I don't have much time for that. The scheduler-extender interface is stable and hasn't changed for ages, so I decided not to touch what already works fine.
The main goal was to get rid of all the moving parts (CRDs and all those fancy STORK controllers - we don't need them), so I reused a single simple piece: github.com/libopenstorage/stork/pkg/extender.
In fact, the current implementation allows dropping in the stork driver, but it can be rewritten at any time. Since the API will not change at all, this can be done in any future version of linstor-scheduler-extender.
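For illustration, the filter half of that stable interface is just as small. A sketch with the same simplified wire types; a real driver would reject nodes without local replicas instead of echoing everything back:

```go
package main

import (
	"encoding/json"
	"net/http"
)

// extenderFilterResult mirrors the wire format the scheduler expects back
// from a filter call; only the fields used here.
type extenderFilterResult struct {
	Nodes       json.RawMessage   `json:"nodes,omitempty"`
	FailedNodes map[string]string `json:"failedNodes,omitempty"`
	Error       string            `json:"error,omitempty"`
}

// filter handles POST /filter: it receives the pod plus candidate nodes and
// returns the subset that is acceptable. This stub passes the node list
// through unchanged; the stork extender plugs a storage driver in here.
func filter(w http.ResponseWriter, r *http.Request) {
	var args struct {
		Nodes json.RawMessage `json:"nodes"`
	}
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	_ = json.NewEncoder(w).Encode(extenderFilterResult{Nodes: args.Nodes})
}

func main() {
	http.HandleFunc("/filter", filter)
	_ = http.ListenAndServe("127.0.0.1:8099", nil)
}
```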
> the only difference is that the PV has a hardcoded nodeAffinity.
BTW, not really related to the issue at hand, but have you looked into the "advanced" access policy patterns: https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-kubernetes-params-allow-remote-volume-access
Using them makes the nodeAffinity in the PV less (or more) strict, if desired. It's still not as dynamic as a custom scheduler config, but at least it can decouple the PV from bare hostnames.
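For readers following along, those patterns are set via StorageClass parameters. A sketch along the lines of the linked guide; the exact parameter name and syntax may vary between CSI versions:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-replicated
provisioner: linstor.csi.linbit.com
parameters:
  autoPlace: "2"
  # Allow network (diskless) attachment, but only from nodes in the same
  # zone as a replica; the PV affinity then pins zones, not hostnames.
  allowRemoteVolumeAccess: |
    - fromSame:
        - topology.kubernetes.io/zone
```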
So, we had a bit of an internal discussion, and the consensus is: yes, we want to support such a scheduler plugin.
@kvaps in https://github.com/piraeusdatastore/piraeus-operator/pull/269#issuecomment-1038971113 you indicated you would be willing to contribute your existing work to piraeus. Did I understand that right?
If so, I need to figure out the proper steps for that. Being part of the CNCF, it might not be as straightforward as forking the repo.
Sure. Let's do that.
I'm already an active CNCF contributor and have signed the CLA as well. Please guide me through the next steps.
Moving forward with this, I'd say we fork your existing repo (we can request "detaching" the fork later, if needed).
After we set up all the images, we add a new chart which deploys the image to https://github.com/piraeusdatastore/helm-charts
Then we can think about adding it as an (optional) dependency to the operator.
@kvaps https://github.com/deckhouse/linstor-scheduler-extender that's the right repo to fork, right?
> https://github.com/deckhouse/linstor-scheduler-extender that's the right repo to fork, right?
Both repos are identical.
Hey, I can try to change the owner of my repo. Which GitHub organization should I choose, piraeusdatastore or linbit?
Please change it to piraeusdatastore
No way:
> You don’t have the permission to create public repositories on piraeusdatastore
It seems I need to be a member of organization first :)
Just realized I'm not an owner in this org, so I can't add you, either :upside_down_face:
If you give me permissions to your own repo, I can initiate the transfer, since I can create public repos
Okay, the invite has been sent.
Still had to fork it. Maybe only the owner can trigger the transfer? :shrug:
https://github.com/piraeusdatastore/linstor-scheduler-extender
I'll see about detaching the fork and adding you as a collaborator
Yeah, you have to write to GitHub support for that. And please make a first release.
It might be important to change the module name first: s/kvaps/piraeusdatastore/g
https://github.com/piraeusdatastore/linstor-scheduler-extender/blob/0136a32bb596483bc8ce6843c0bb9cb4893d9f91/go.mod#L1
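That is, the first line of `go.mod` becomes:

```
module github.com/piraeusdatastore/linstor-scheduler-extender
```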
I've invited you to the repo. Changes are here, please review: https://github.com/piraeusdatastore/linstor-scheduler-extender/pull/1
Detach already went through :)
> Just realized I'm not an owner in this org, so I can't add you, either :upside_down_face:
That's sad. I would love to have one more organization badge on my GitHub profile and feel like part of the project.
Maybe that fact would even improve the quality of my contributions 😉