[Kubeflow 1.9] Distributions and Kubeflow 1.9
This issue will be used to track the progress of, and coordinate with, distributions throughout the 1.9 release.
While we hope all distros will manage to be ready when the KF 1.9 release is out, this is sometimes difficult to achieve. In this issue, we want to both keep track of the progress of distributions towards the KF 1.9 release and also know which of the distros will be working on KF 1.9 (testing during the distribution testing cycle) even if they can't meet the KF 1.9 deadline.
Tagging distribution owners identified from previous releases (any new or missed distro owners, please comment on this issue):
| Distribution | Representative(s) | State |
|---|---|---|
| AWS | @surajkota | not participating in 1.9 |
| Charmed Kubeflow | @DnPlas | participating in 1.9 |
| Google Cloud | @gkcalat @zijianjoy @Linchin | not participating in 1.9 |
| IBM IKS | @Tomcli @yhwang | participating in 1.9 |
| Microsoft | | not participating in 1.9 |
| Nutanix | @johnugeorge @nagar-ajay | participating in 1.9 |
| Red Hat OpenShift AI | @rimolive | participating in 1.9 |
| Oracle Cloud Infrastructure | @julioo | not participating in 1.9 |
| DeployKF | @thesuperzapper | participating in 1.9 |
| VMware | @liuqi @xujinheng | participating in 1.9 |
| QBO | @alexeadem | participating in 1.9 |
Please let us know if you'll be participating in the 1.9 release by answering the following questions:
- Are you planning on having your distro ready in sync with the KF 1.9 release?
- Will you participate by testing your distro during the distribution testing phase and providing feedback (reporting any issues to the release team)?
- If you cannot participate, when can the community expect your distro to be ready for release 1.9?
Please note the release timelines are being discussed in kubeflow/manifests#2606.
cc @kubeflow/release-team @jbottum
@rimolive can you remove @DnPlas from Charmed Kubeflow and replace her with me? ty!
As to your questions, for Charmed Kubeflow:
- Are you planning on having your distro ready in sync with the KF 1.9 release?
- yes
- Will you participate by testing your distro during the distribution testing phase and providing feedback (reporting any issues to the release team)?
- yes
@rimolive deployKF will participate in 1.9, but it's not 100% clear exactly what that will look like.
Separately, given that "Kubeflow on AWS" did not participate in 1.8, and announced they were no longer supporting their distribution in https://github.com/awslabs/kubeflow-manifests/issues/794, I think it's unlikely they will do 1.9?
Given this, I proposed moving them to "legacy" on the Kubeflow website in this PR: https://github.com/kubeflow/website/pull/3641.
However, I also want to avoid confusing users, because they might think that Kubeflow no longer supports AWS due to the "Kubeflow on AWS" name. So I also think we should merge https://github.com/kubeflow/website/pull/3643 at the same time, which tells users that "Kubeflow on XXXX" is just a name, and NOT the ONLY way to use Kubeflow on that platform.
For IBM IKS:
- Are you planning on having your distro ready in sync with the KF 1.9 release?
  - Yes
- Will you participate by testing your distro during the distribution testing phase and providing feedback (reporting any issues to the release team)?
  - Yes
For VMware Distro:
- Are you planning on having your distro ready in sync with the KF 1.9 release?
  - Yes
- Will you participate by testing your distro during the distribution testing phase and providing feedback (reporting any issues to the release team)?
  - Yes
For QBO Distro:
- Are you planning on having your distro ready in sync with the KF 1.9 release?
  - Yes
- Will you participate by testing your distro during the distribution testing phase and providing feedback (reporting any issues to the release team)?
  - Yes
Calling all Distribution owners! I'm proud to announce our first Release Candidate for Kubeflow 1.9!
You can find the release details in the following URL:
https://github.com/kubeflow/manifests/releases/tag/v1.9.0-rc.0
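For distro owners who want a vanilla install of this RC to compare against, here is a minimal sketch along the lines of the manifests README (assuming kustomize 5.x, kubectl, and a disposable test cluster; check the README on the tag for the authoritative steps):

```sh
# Check out the RC tag and apply the all-in-one example kustomization.
# The retry loop covers CRDs that need a moment to register before
# dependent resources can be applied.
git clone --branch v1.9.0-rc.0 https://github.com/kubeflow/manifests.git
cd manifests
while ! kustomize build example | kubectl apply -f -; do
  echo "Retrying to apply resources"
  sleep 20
done
```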
We'll be working on another Release Candidate when we have Notebooks and KServe Models Webapp updated and ready for KF 1.9. We can use this issue to keep track of blocker issues for distributions while we work on fixing them.
cc @ca-scribner @yhwang @johnugeorge @nagar-ajay @thesuperzapper @liuqi @xujinheng @alexeadem @alex-treebeard
We also have to update cert-manager, Knative, Istio, Seldon, BentoML, etc., which will come in later RCs.
@ca-scribner @yhwang @johnugeorge @nagar-ajay @thesuperzapper @liuqi @xujinheng @alexeadem @alex-treebeard Can you please acknowledge that you are aware of Kubeflow 1.9 RC0 and that the distribution testing phase has started? Please react with a thumbs up if everything is okay from your side and you are proceeding with testing.
deployKF is mostly waiting on the updates from Notebooks (https://github.com/kubeflow/kubeflow/issues/7453), but I am aware that a 1.9.0-RC0 was cut with other components.
What do we mean by '(around 1.28)' here: https://github.com/kubeflow/manifests/tree/v1.9.0-rc.0?tab=readme-ov-file#prerequisites
Is that v1.28.0 and v1.27.11?
I'm proceeding with the testing in QBO.
OK: Everything is looking good in QBO. I tested with a vector addition job.
Details:

```sh
$ git branch
* (HEAD detached at v1.9.0-rc.0)
```
On Kubernetes v1.28.0:

```sh
$ qbo get nodes kubeflow_v1_9_0_nvidia | jq .nodes[]?.image
"kindest/node:v1.28.0"
"kindest/node:v1.28.0"
"kindest/node:v1.28.0"
```
with the NVIDIA GPU Operator:

```sh
$ helm list -n gpu-operator
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/alex/.qbo/kubeflow_v1_9_0_nvidia.conf
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /home/alex/.qbo/kubeflow_v1_9_0_nvidia.conf
NAME                     NAMESPACE     REVISION  UPDATED                                  STATUS    CHART                 APP VERSION
gpu-operator-1715634796  gpu-operator  1         2024-05-13 21:13:18.636880948 +0000 UTC  deployed  gpu-operator-v24.3.0  v24.3.0
```
and Kustomize:

```sh
$ ./kustomize version
v5.4.1
```
There was only one small change I had to make: the platform-agnostic-multi-user-pns overlay is no longer available (as per https://github.com/kubeflow/pipelines/issues/5285), so this no longer works:

```sh
./kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user-pns | kubectl apply -f -
```

I used the following instead, and I'll update the QBOT installer for this version:

```sh
./kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user | kubectl apply -f -
```
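As a quick sanity check after swapping overlays, something like the following confirms the multi-user pipelines pods come up (a sketch; `application-crd-id=kubeflow-pipelines` is the common label applied by the upstream KFP manifests, so adjust if your build differs):

```sh
# Wait for all Kubeflow Pipelines pods in the kubeflow namespace to become Ready
kubectl -n kubeflow wait --for=condition=Ready pod \
  -l application-crd-id=kubeflow-pipelines --timeout=600s
```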
This is what was deployed:

```sh
$ kubectl get pods --all-namespaces -o jsonpath="{..image}" | sed 's/ /\n/g' | sort | uniq
docker.io/istio/pilot:1.17.5
docker.io/istio/proxyv2:1.17.5
docker.io/kindest/kindnetd:v20220726-ed811e41
docker.io/kindest/local-path-provisioner:v0.0.22-kind.0
docker.io/kserve/kserve-controller:v0.12.1
docker.io/kserve/models-web-app:v0.10.0
docker.io/kubeflow/training-operator:v1-f8f7363
docker.io/kubeflowkatib/katib-controller:v0.17.0-rc.0
docker.io/kubeflowkatib/katib-db-manager:v0.17.0-rc.0
docker.io/kubeflowkatib/katib-ui:v0.17.0-rc.0
docker.io/kubeflownotebookswg/centraldashboard:v1.8.0
docker.io/kubeflownotebookswg/jupyter-scipy:v1.8.0
docker.io/kubeflownotebookswg/jupyter-web-app:v1.8.0
docker.io/kubeflownotebookswg/kfam:v1.8.0
docker.io/kubeflownotebookswg/notebook-controller:v1.8.0
docker.io/kubeflownotebookswg/poddefaults-webhook:v1.8.0
docker.io/kubeflownotebookswg/profile-controller:v1.8.0
docker.io/kubeflownotebookswg/pvcviewer-controller:v1.8.0
docker.io/kubeflownotebookswg/tensorboard-controller:v1.8.0
docker.io/kubeflownotebookswg/tensorboards-web-app:v1.8.0
docker.io/kubeflownotebookswg/volumes-web-app:v1.8.0
docker.io/library/mysql:8.0.29
docker.io/library/python:3.7
docker.io/metacontrollerio/metacontroller:v2.0.4
gcr.io/knative-releases/knative.dev/eventing/cmd/controller@sha256:92967bab4ad8f7d55ce3a77ba8868f3f2ce173c010958c28b9a690964ad6ee9b
gcr.io/knative-releases/knative.dev/eventing/cmd/webhook@sha256:ebf93652f0254ac56600bedf4a7d81611b3e1e7f6526c6998da5dd24cdc67ee1
gcr.io/knative-releases/knative.dev/net-istio/cmd/controller@sha256:421aa67057240fa0c56ebf2c6e5b482a12842005805c46e067129402d1751220
gcr.io/knative-releases/knative.dev/net-istio/cmd/webhook@sha256:bfa1dfea77aff6dfa7959f4822d8e61c4f7933053874cd3f27352323e6ecd985
gcr.io/knative-releases/knative.dev/serving/cmd/activator@sha256:c2994c2b6c2c7f38ad1b85c71789bf1753cc8979926423c83231e62258837cb9
gcr.io/knative-releases/knative.dev/serving/cmd/autoscaler@sha256:8319aa662b4912e8175018bd7cc90c63838562a27515197b803bdcd5634c7007
gcr.io/knative-releases/knative.dev/serving/cmd/controller@sha256:98a2cc7fd62ee95e137116504e7166c32c65efef42c3d1454630780410abf943
gcr.io/knative-releases/knative.dev/serving/cmd/domain-mapping-webhook@sha256:7368aaddf2be8d8784dc7195f5bc272ecfe49d429697f48de0ddc44f278167aa
gcr.io/knative-releases/knative.dev/serving/cmd/domain-mapping@sha256:f66c41ad7a73f5d4f4bdfec4294d5459c477f09f3ce52934d1a215e32316b59b
gcr.io/knative-releases/knative.dev/serving/cmd/webhook@sha256:4305209ce498caf783f39c8f3e85dfa635ece6947033bf50b0b627983fd65953
gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
gcr.io/ml-pipeline/api-server:2.2.0
gcr.io/ml-pipeline/cache-deployer:2.2.0
gcr.io/ml-pipeline/cache-server:2.2.0
gcr.io/ml-pipeline/frontend:2.2.0
gcr.io/ml-pipeline/metadata-envoy:2.2.0
gcr.io/ml-pipeline/metadata-writer:2.2.0
gcr.io/ml-pipeline/minio:RELEASE.2019-08-14T20-37-41Z-license-compliance
gcr.io/ml-pipeline/mysql:8.0.26
gcr.io/ml-pipeline/persistenceagent:2.2.0
gcr.io/ml-pipeline/scheduledworkflow:2.2.0
gcr.io/ml-pipeline/viewer-crd-controller:2.2.0
gcr.io/ml-pipeline/visualization-server:2.2.0
gcr.io/ml-pipeline/workflow-controller:v3.4.16-license-compliance
gcr.io/tfx-oss-public/ml_metadata_store_server:1.14.0
ghcr.io/dexidp/dex:v2.36.0
kserve/kserve-controller:v0.12.1
kserve/models-web-app:v0.10.0
kubeflow/training-operator:v1-f8f7363
kubeflownotebookswg/jupyter-scipy:v1.8.0
mysql:8.0.29
nvcr.io/nvidia/cloud-native/gpu-operator-validator:v24.3.0
nvcr.io/nvidia/gpu-operator:v24.3.0
nvcr.io/nvidia/k8s-device-plugin:v0.15.0-ubi8
nvcr.io/nvidia/k8s/container-toolkit:v1.15.0-ubuntu20.04
nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
nvcr.io/nvidia/k8s/dcgm-exporter:3.3.5-3.4.1-ubuntu22.04
python:3.7
quay.io/jetstack/cert-manager-cainjector:v1.12.2
quay.io/jetstack/cert-manager-controller:v1.12.2
quay.io/jetstack/cert-manager-webhook:v1.12.2
quay.io/oauth2-proxy/oauth2-proxy:v7.6.0
registry.k8s.io/coredns/coredns:v1.10.1
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/kube-apiserver:v1.28.0
registry.k8s.io/kube-controller-manager:v1.28.0
registry.k8s.io/kube-proxy:v1.28.0
registry.k8s.io/kube-scheduler:v1.28.0
registry.k8s.io/nfd/node-feature-discovery:v0.15.4
```
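For reference, the vector-addition check can be reproduced with a one-off pod along these lines (a sketch using the cuda-sample image from the list above; the pod name and timeout are arbitrary, not the exact manifest used here):

```sh
# Hypothetical GPU smoke test: run the CUDA vectorAdd sample on one GPU
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vectoradd
    image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
# Wait for completion, then check the logs for "Test PASSED"
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cuda-vectoradd --timeout=120s
kubectl logs pod/cuda-vectoradd
```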
@alexeadem please check the updated release notes: https://github.com/kubeflow/manifests/releases/tag/v1.9.0-rc.0. Kubernetes 1.27-1.29 is officially supported. And yes, we made Emissary the default executor in 1.7 or 1.8, which is why the PNS overlay was removed.