manifests icon indicating copy to clipboard operation
manifests copied to clipboard

proposal: Guidelines for /contrib components

Open kimwnasptd opened this issue 3 years ago • 8 comments

Create an initial proposal regarding the expectations we should have around the components in /contrib as well as a deprecation plan for unmaintained components.

I'd like to wait 2 weeks before merging this PR, to gather feedback on the proposal. Then once it's merged I'll do a pass on the components and create issues for the ones that don't meet the requirements. If there won't be any feedback in those issues, then the components will enter a deprecation phase.

@holdenk @Jeffwan @krishnadurai @yanniszark @juliusvonkohout @zijianjoy @gkcalat @zijianjoy @gkcalat @Jeffwan @woop @tedhtchang @Tomcli @animeshsingh @pvaneck @axsaucedo @adriangonz @cliveseldon @ryandawsonuk @ellistarn @yuzisun @cliveseldon @animeshsingh @deadeyegoodwin

cc-ing the release team as well @annajung @DomFleischmann @yubozhao @surajkota @DnPlas @jbottum

kimwnasptd avatar Sep 27 '22 10:09 kimwnasptd

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kimwnasptd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

google-oss-prow[bot] avatar Sep 27 '22 10:09 google-oss-prow[bot]

/hold

kimwnasptd avatar Sep 27 '22 10:09 kimwnasptd

I think "The K8s version and KF component versions that the component needs" is not always needed and can be ued as excuse to not maintain stuff. We should expect also /contrib to work with the recent Kubeflow version. So if you remove everything that has not been touched for 2 years and is most likely broken, you get

  • Feast (must be updated or dropped for 1.6, a lot of effort)
  • kserve
  • metacontroller
  • networkpolicies
  • seldon (must be updated from 1.9 to a newer version that supports Kubernetes 1.22, should be simple https://github.com/kubeflow/manifests/issues/2290)

@sylus seems to have interest in upgrading seldon according to https://github.com/kubeflow/manifests/issues/2207#issuecomment-1253917266 and also @nithin8702 https://github.com/kubeflow/manifests/issues/2194#issuecomment-1253630855

juliusvonkohout avatar Sep 27 '22 12:09 juliusvonkohout

@kimwnasptd we have two related sessions at the Kubeflow Summit on Day 2, 1:00 pm - 1:30 pm - Manifests, and 1:30 pm - 2:00 pm - Components (How do we take care non core components). Would you please consider facilitating the Manifest session? I think this PR would be a good topic for these sessions and it would be great if you could review it in one of those two sessions.

jbottum avatar Oct 06 '22 15:10 jbottum

@kimwnasptd I like this proposal. Some comments in the Component Requirements section. Should we consider enhancing item 3. to include the need for a test plan and a verification that it has been executed for the latest release? Should there be an item 5 that would address how users might get support i.e. via a Kubeflow slack channel ?

jbottum avatar Oct 06 '22 15:10 jbottum

@jbottum

Would you please consider facilitating the Manifest session? I think this PR would be a good topic for these sessions and it would be great if you could review it in one of those two sessions

Yes, this can definitely be a topic to go through as well as how to make it easier for new components to integrate with Kubeflow.

Should we consider enhancing item 3. to include the need for a test plan and a verification that it has been executed for the latest release?

That's a pretty good suggestion actually, to expect a series of steps [can later become a script, that we can use to validate a component is working as expected. This also links to what @annajung proposed above https://github.com/kubeflow/manifests/pull/2286#discussion_r981484770

I'll add a new bullet for testing that asks maintainers to provide:

  1. A simple script that runs and ensures the component is working as expected. It can be something as simple as creating a CustomResource object and wait for it to become ready
  2. Work with the Manifests WG to ensure that there's some automation for that test

Should there be an item 5 that would address how users might get support i.e. via a Kubeflow slack channel ?

I think there's no need for this one. As long as the OWNERS file is up to date we can expect issues to be opened and we can ping the maintainers in these issues.

kimwnasptd avatar Oct 10 '22 15:10 kimwnasptd

At this point I believe we've addressed all the provided feedback. Also thank you everyone for your valuable input!

I think the documented requirements will really help ensure our components will be able to stand the test of time. Next work items here will be to expose more documentation on what are the actual technical details for a component to be "integrated" with the rest of the KF components. I.e.

  • Namespace isolation
  • User permissions on user-namespaces, for any new CustomResources that will be created
  • Servers that act on behalf of users will need to respect the user in the kubeflow-userid header, and use SubjectAccessReviews
  • Integration with the CentralDashboard
  • Possibility for a KFP component [if needed]

I'll keep this PR open for 2 more days to see if we'll have any last feedback and then I'll merge the PR. Thank you for your time everyone!

kimwnasptd avatar Oct 10 '22 16:10 kimwnasptd

At this point I believe we've addressed all the provided feedback. Also thank you everyone for your valuable input!

I think the documented requirements will really help ensure our components will be able to stand the test of time. Next work items here will be to expose more documentation on what are the actual technical details for a component to be "integrated" with the rest of the KF components. I.e.

* Namespace isolation

* User permissions on user-namespaces, for any new CustomResources that will be created

* Servers that act on behalf of users will need to respect the user in the `kubeflow-userid` header, and use SubjectAccessReviews

* Integration with the CentralDashboard

* Possibility for a KFP component [if needed]

I'll keep this PR open for 2 more days to see if we'll have any last feedback and then I'll merge the PR. Thank you for your time everyone!

Yes, definitely no root containers / runasnonroot. They must run with the offical kubernetes podsecuritystandards restricted set https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted. That is currently the case if you use istio-cni and all other official components. I worked over the last two years with almost all WGs to make this possible. Only this enables us to significantly harden Kubeflow in the future and make it ready for enterprise consumption.

juliusvonkohout avatar Oct 10 '22 16:10 juliusvonkohout

@kimwnasptd https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/ will be incredible exiting. I am still discussing with security researchers, but it could once and for all remove the runasnonroot problem. The PVC support will come in the future https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/127-user-namespaces/README.md#phase-2-pods-with-volumes. But then in around a year, when modern distributions are using that, we could at least consider runasnonroot exceptions e.g. for MLflow from infinstor in the Kubeflow profile namespaces. Maybe i will use kyverno then and add two policies into contrib: The offical restricted PSS and and another one which allows root (not privileged) and enforces hostUsers=false.

juliusvonkohout avatar Oct 19 '22 12:10 juliusvonkohout

At this point I believe we can merge this PR and start cleaning up the /contrib dir

/cc @kubeflow/wg-manifests-leads

/hold cancel

kimwnasptd avatar Oct 19 '22 16:10 kimwnasptd

/lgtm

elikatsis avatar Oct 19 '22 16:10 elikatsis