Failed Operator will leave Serving installation in a partial state.
In what area(s)?
/area autoscale
What version of Knative?
0.11.x
kn version
Version: v1.11.0
Build Date: 2023-07-27 07:42:56
Git Revision: b7508e67
Supported APIs:
* Serving
- serving.knative.dev/v1 (knative-serving v1.11.0)
* Eventing
- sources.knative.dev/v1 (knative-eventing v1.11.0)
- eventing.knative.dev/v1 (knative-eventing v1.11.0)
Expected Behavior
Using kn service create 'hello-example' --image ghcr.io/knative/helloworld-go:latest --env TARGET="First", I expect to deploy a hello-world example so I can start experimenting with Knative.
Actual Behavior
kn service create 'hello-example' --image ghcr.io/knative/helloworld-go:latest --env TARGET="First"
Creating service 'hello-example' in namespace 'default':
0.072s The Route is still working to reflect the latest desired specification.
0.072s Configuration "hello-example" is waiting for a Revision to become ready.
0.072s ...
1.153s Revision "hello-example-00001" failed with message: Failed to create new replica set "hello-example-00001-deployment-7b56748d46": Unauthorized.
1.166s Configuration "hello-example" does not have any ready Revision.
1.176s ...
1.179s Configuration "hello-example" is waiting for a Revision to become read
The process starts but doesn't complete. The pod is successfully scheduled in the default namespace and is ready; however, the Knative Service never becomes ready.
kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-example-00001-deployment-7b56748d46-mt5kk 2/2 Running 0 31s
Steps to Reproduce the Problem
- On macOS, install docker-desktop and enable k8s (docker-desktop cluster: Engine 25.03, k8s v1.29.1)
- Install operator-sdk using instructions found here
- Install OLM using operator-sdk olm install
- Install an operator using kubectl create -f https://operatorhub.io/install/knative-operator.yaml or apply this manifest:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: knative-operator
  namespace: operators
spec:
  channel: stable
  name: knative-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
- Apply the k8s manifests to enable serving:
apiVersion: v1
kind: Namespace
metadata:
  name: knative-serving
---
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
- Run kn service create 'hello-example' --image ghcr.io/knative/helloworld-go:latest --env TARGET="First"
Any additional details and investigation so far can be found on CNCF Slack here
Following up here: it looks like the default installation expects Istio, and when Istio is not installed the operator fails with Ready=False, reporting that the Istio resources are not present.
This halts the installation of the remaining manifests and leaves Serving in a partial state. For example, in the case above the mutating and validating webhooks were not installed. That allowed the user to create a Knative Service, and reconciliation went ahead, but when the PodAutoscaler was created, the annotation that selects which autoscaler to use was never defaulted.
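To make the missing default concrete, here is a minimal sketch that checks whether a PodAutoscaler object carries an autoscaler class annotation. It assumes the annotation in question is autoscaling.knative.dev/class and that the object has been loaded as a dict (for instance from kubectl get podautoscaler <name> -o json). The sample object is hypothetical.

```python
# Assumption: this is the annotation the webhook would normally default.
AUTOSCALER_CLASS_ANNOTATION = "autoscaling.knative.dev/class"


def missing_autoscaler_class(pod_autoscaler: dict) -> bool:
    """Return True when the PodAutoscaler has no autoscaler class annotation."""
    annotations = pod_autoscaler.get("metadata", {}).get("annotations") or {}
    return not annotations.get(AUTOSCALER_CLASS_ANNOTATION)


# Hypothetical object, shaped like a PodAutoscaler created without the webhooks.
example = {
    "kind": "PodAutoscaler",
    "metadata": {"name": "hello-example-00001", "annotations": {}},
}

print(missing_autoscaler_class(example))
```

A True result here would be consistent with the defaulting webhook not having run.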
Ideally, the operator would try to apply all the resources in the manifest and then report every error the installation encounters.
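The "apply everything, then report all errors" idea can be sketched as follows. This is only an illustration of the pattern, not the operator's actual code: apply_all, fake_apply, and the manifest entries are hypothetical names, and the fake apply function stands in for a real cluster call.

```python
from typing import Callable, Iterable, List


def apply_all(resources: Iterable[dict], apply: Callable[[dict], None]) -> List[str]:
    """Try to apply every resource and collect failures instead of stopping early."""
    errors: List[str] = []
    for resource in resources:
        name = "{}/{}".format(resource.get("kind"), resource.get("metadata", {}).get("name"))
        try:
            apply(resource)
        except Exception as exc:  # keep going so later resources still get applied
            errors.append("{}: {}".format(name, exc))
    return errors


applied: List[str] = []


def fake_apply(resource: dict) -> None:
    """Hypothetical stand-in for a cluster call; rejects a kind whose CRD is absent."""
    if resource.get("kind") == "VirtualService":
        raise RuntimeError("Istio CRDs are not installed")
    applied.append(resource["metadata"]["name"])


# Hypothetical manifest contents, for illustration only.
manifest = [
    {"kind": "VirtualService", "metadata": {"name": "example-ingress"}},
    {"kind": "MutatingWebhookConfiguration", "metadata": {"name": "example-webhook"}},
]

errors = apply_all(manifest, fake_apply)
for line in errors:
    print(line)
```

With this shape, the webhook entry is still applied even though the Istio-dependent entry fails, and the failure is reported at the end.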
But since the operator did report the failure, I think we could simply document how to check the installation status in the docs.
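As a rough sketch of what such a check could look like, the snippet below lists every status condition on a KnativeServing resource that is not True. It assumes the resource has been loaded as a dict, for example from kubectl get knativeserving knative-serving -n knative-serving -o json. The sample status and its reason and message text are hypothetical.

```python
from typing import List, Tuple


def failing_conditions(knative_serving: dict) -> List[Tuple[str, str]]:
    """Return (type, message) for every status condition that is not True."""
    conditions = knative_serving.get("status", {}).get("conditions", [])
    return [
        (c.get("type", ""), c.get("message", ""))
        for c in conditions
        if c.get("status") != "True"
    ]


# Hypothetical status, loosely modelled on the Ready=False case in this issue.
sample = {
    "kind": "KnativeServing",
    "status": {
        "conditions": [
            {"type": "Ready", "status": "False", "message": "Istio resources are not present"},
        ]
    },
}

for cond_type, message in failing_conditions(sample):
    print("{}: {}".format(cond_type, message))
```

An empty result would suggest the installation finished; any entry means it is worth inspecting before creating a Knative Service.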
I'll leave this issue open for @houshengbo to close out and make a docs issue.
This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.