seldon-core icon indicating copy to clipboard operation
seldon-core copied to clipboard

Add support for passing nodeSelector & tolerations to Orchestrator

Open hjilke opened this issue 2 years ago • 8 comments

What this PR does / why we need it:

Adding support for passing nodeSelector & tolerations to the deployment of the Seldon orchestrator. Hopefully, this is something useful.

Which issue(s) this PR fixes:

Fixes #3852

Special notes for your reviewer:

hjilke avatar Sep 13 '22 15:09 hjilke

Hi @hjilke. Thanks for your PR.

I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository.

seldondev avatar Sep 13 '22 15:09 seldondev

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: To complete the pull request process, please assign adriangonz You can assign the PR to them by writing /assign @adriangonz in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

seldondev avatar Sep 13 '22 15:09 seldondev

/assign @adriangonz

hjilke avatar Sep 13 '22 15:09 hjilke

Thank you for the contribution @hjilke - the way that we actually create our Helm charts is actually through a process that converts the "source-of-truth" which is our kustomize objects under operator/config into the helm charts using a python script under operator/helm/split_resources.py. If you wish to add any new fields to the helm chart you will need to do it through that script, as opposed to directly into the file

axsaucedo avatar Sep 14 '22 09:09 axsaucedo

Ok cool, let me have a look tomorrow!

hjilke avatar Sep 14 '22 16:09 hjilke

Hey @axsaucedo,

I had a quick look at operator/config. Would it be ok to add default values here e.g. tolerations: [] and nodeSelector: {}. And then overwrite them like here

hjilke avatar Sep 15 '22 15:09 hjilke

@hjilke yes that sounds reasonable in principle, let us know if you get any issues

axsaucedo avatar Sep 16 '22 14:09 axsaucedo

Hey @axsaucedo,

I updated my PR according to my suggestions above. I was a bit unsure about modifying values.yaml. Anyways, just let me know what you think.

hjilke avatar Sep 18 '22 16:09 hjilke

Any feedback @axsaucedo?

hjilke avatar Sep 23 '22 12:09 hjilke

Thank you for the update @hjilke - I can see that now the command does run and update the selector and tolerations. You will just need to run it for this to work correctly, namely make -C operator/helm create - however I can see that it's also modifying the webhooks.yaml, I believe this shoudl have been fixed in a previous PR, but if you can re-submit the changes without the wehbook.yaml changes then we should be able to merge.

Thank you very much for your contribution!

image

axsaucedo avatar Sep 23 '22 14:09 axsaucedo

@axsaucedo done! Thanks for the feedback. Let me know, if there are any further changes required

hjilke avatar Sep 25 '22 15:09 hjilke

@hjilke unfortunately it seems that there are still some issues, but almost there - the issue is that currently the value is being set as a string, as opposed to an array:

image

You can test the local install with:

helm upgrade --install seldon-core helm-charts/seldon-core-operator/ --namespace seldon-system # --set istio.enabled="true" --set istio.gateway="seldon-gateway.istio-system.svc.cluster.local"

axsaucedo avatar Sep 26 '22 08:09 axsaucedo

Ok, perfect. I will update the PR tomorrow!

hjilke avatar Oct 05 '22 15:10 hjilke

Hey @axsaucedo I could fix the issues with the type of tolerations. However, I still face:

 # helm upgrade --install seldon-core helm-charts/seldon-core-operator/ --namespace seldon-system 

Release "seldon-core" does not exist. Installing it now.
Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(Deployment.spec.template.spec.nodeSelector): invalid type for io.k8s.api.core.v1.PodSpec.nodeSelector: got "string", expected "map"

I understand the message, but do you have any suggestions how to fix that?

hjilke avatar Oct 06 '22 18:10 hjilke

@hjilke certainly - this would require removing the quotation marks, although it may require extending the way that the script performs this replace, you could have a look at other examples where the resulting replacements are a map that doesn't get added with strings

axsaucedo avatar Oct 13 '22 13:10 axsaucedo

@axsaucedo for both I had already quotations marks removed. What seems to work now is parsing the nodeSelector as yaml in deployment_seldon-controller-manager.yaml.

Now installing the charts works.

hjilke avatar Oct 15 '22 12:10 hjilke

@axsaucedo any feedback?

hjilke avatar Oct 22 '22 10:10 hjilke

Sorry, I am spending my free time here and review takes more than 23 days. I dont think that is fair. Closing PR now.

hjilke avatar Nov 14 '22 13:11 hjilke

That's fair @hjilke, apologies as it's been quite busy at the Seldon side - we'll pick up on this PR / branch from this efforts and fix the final requests made above, your efforst are very much appreciated.

axsaucedo avatar Nov 14 '22 13:11 axsaucedo