Unable to get `kfctl apply` to fully deploy kfctl_k8s_istio.v1.2.0.yaml

Open theNewFlesh opened this issue 4 years ago • 7 comments

Running Kubeflow via minikube start --docker

kfctl_k8s_istio.v1.2.0.yaml is just the file curled from the tutorial.

kfctl apply -V -f kfctl_k8s_istio.v1.2.0.yaml gives me errors:

In the admission-webhook-bootstrap-stateful-set-0 pod logs I find this:

Error from server (NotFound): mutatingwebhookconfigurations.admissionregistration.k8s.io "admission-webhook-mutating-webhook-configuration" not found
patching ca bundle for webhook configuration...
Error from server (NotFound): mutatingwebhookconfigurations.admissionregistration.k8s.io "admission-webhook-mutating-webhook-configuration" not found

Also, the pods metacontroller-0 and istio-telemetry both go into a CrashLoopBackOff death spiral without any logs.

Help please.


macOS 11.0.1, minikube v1.16.0, kfctl v1.2.0-0-gbc038f9, Docker Desktop 3.0.3

theNewFlesh avatar Jan 06 '21 09:01 theNewFlesh

Also found this in metacontroller-0 events:

Error: failed to start container "metacontroller": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: process_linux.go:422: setting cgroup config for procHooks process caused: failed to write "400000" to "/sys/fs/cgroup/cpu/kubepods/burstable/pod22bf4ea2-a41c-43e1-a6d4-f807ebc00756/metacontroller/cpu.cfs_quota_us": write /sys/fs/cgroup/cpu/kubepods/burstable/pod22bf4ea2-a41c-43e1-a6d4-f807ebc00756/metacontroller/cpu.cfs_quota_us: invalid argument: unknown
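For reference, a rough reading of the quota value in that error. This assumes the default CFS period of 100000 µs, which the error itself does not show:

```shell
# quota_us comes from the error message above; period_us is an assumed default.
quota_us=400000    # value the runtime tried to write to cpu.cfs_quota_us
period_us=100000   # assumed default cpu.cfs_period_us (100 ms)

# CPU limit expressed in whole CPUs = quota / period
cpu_limit=$((quota_us / period_us))
echo "container CPU limit: ${cpu_limit} CPUs"
```

Under that assumption the container is asking for a 4-CPU limit. If the minikube node itself is capped below 4 CPUs, that would be consistent with the write being rejected, though this is an inference and not something the log states.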

theNewFlesh avatar Jan 06 '21 09:01 theNewFlesh

Minikube setup instruction as follows

https://www.kubeflow.org/docs/started/workstation/minikube-linux/#start-minikube

Could you try to start minikube this way?

moficodes avatar Jan 07 '21 21:01 moficodes

Yeah I tried that first, same issue.

theNewFlesh avatar Jan 07 '21 22:01 theNewFlesh

I've rebuilt everything to see if I can reproduce the issue. Now I get `Readiness probe failed: HTTP probe failed with statuscode: 503` for istio-ingressgateway.

Logs: 2021-01-07T23:03:42.107903Z info Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected

theNewFlesh avatar Jan 07 '21 23:01 theNewFlesh

A little help please.

theNewFlesh avatar Jan 11 '21 19:01 theNewFlesh

I was also just having this issue with pretty much the same setup, and found this comment, which seemed to solve the problem: https://github.com/kubeflow/kubeflow/issues/5447#issuecomment-740718229. Specifically, I ended up modifying the minikube start command to:

minikube start \
--cpus 6 \
--memory 12288 \
--disk-size=120g \
--extra-config=apiserver.service-account-issuer=api \
--extra-config=apiserver.service-account-signing-key-file=/var/lib/minikube/certs/sa.key \
--extra-config=apiserver.service-account-api-audiences=api \
--kubernetes-version v1.16.15

...and now all of the pods have started successfully.

The Kubernetes version was set based on the compatibility matrix, although I'm not sure if it's strictly necessary. Hopefully that fixes your issue too.

peteboothroyd avatar Jan 12 '21 22:01 peteboothroyd

I was hit by the same issue on different versions of Kubernetes (1.18 to 1.20). The solution was simple. I installed cert-manager separately from the official manifest:

$ kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.3.0/cert-manager.yaml

and then installed the Kubeflow components one by one.
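
A minimal sketch of waiting for cert-manager to come up before applying anything else. It assumes the manifest above creates its deployments in the `cert-manager` namespace; the function name and the 300s timeout are only illustrative choices:

```shell
# Hypothetical helper: block until every cert-manager deployment reports Available.
wait_for_cert_manager() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace cert-manager --timeout=300s
}
```

Calling `wait_for_cert_manager` right after the `kubectl apply` returns non-zero if the deployments are not ready within the timeout.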

I also checked that the flags --service-account-api-audiences, --service-account-signing-key-file, and --service-account-issuer were added to the API server, but that is the typical setup in many Kubernetes distributions, so it doesn't look like the solution.
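
To check these flags on a given cluster, something like the following might work. It is a sketch that assumes a kubeadm-style cluster (such as minikube) where the API server runs as a pod labelled `component=kube-apiserver` in `kube-system`; `sa_flags` is a made-up helper name.

```shell
# Hypothetical filter: print only the service-account flags found on stdin.
sa_flags() {
  grep -oE -e '--service-account-(issuer|signing-key-file|api-audiences)=[^ "]+'
}

# Example on a cluster (assumes the label and namespace described above):
#   kubectl -n kube-system get pod -l component=kube-apiserver -o yaml | sa_flags
```

Any flag that does not show up in the output is not set on that API server.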

gecube avatar Apr 17 '21 09:04 gecube