chaosblade-operator
Can't install chaosblade-operator-1.5.0
Issue Description
Type: bug
Describe what happened (or what feature you want)
I installed chaosblade-operator v1.5.0 once before and then uninstalled it using:
helm del chaosblade-operator -n kube-system
kubectl delete deployment chaosblade-operator -n kube-system
kubectl delete crd chaosblades.chaosblade.io
When I try to install chaosblade-operator v1.5.0 again, the installation fails. I've tried it in 2 environments, and both show the same error.
kubectl get deploy chaosblade-operator -n kube-system -o json
result
"status": {
  "conditions": [
    {
      "lastTransitionTime": "2022-02-02T06:25:28Z",
      "lastUpdateTime": "2022-02-02T06:25:28Z",
      "message": "Created new replica set \"chaosblade-operator-5ccf675967\"",
      "reason": "NewReplicaSetCreated",
      "status": "True",
      "type": "Progressing"
    },
    {
      "lastTransitionTime": "2022-02-02T06:25:28Z",
      "lastUpdateTime": "2022-02-02T06:25:28Z",
      "message": "Deployment does not have minimum availability.",
      "reason": "MinimumReplicasUnavailable",
      "status": "False",
      "type": "Available"
    },
    {
      "lastTransitionTime": "2022-02-02T06:25:29Z",
      "lastUpdateTime": "2022-02-02T06:25:29Z",
      "message": "Internal error occurred: failed calling webhook \"chaosblade-operator.kube-system.svc\": failed to call webhook: Post \"https://chaosblade-webhook-server.kube-system.svc:443/mutating-pods?timeout=10s\": dial tcp 10.101.171.120:443: connect: connection refused",
      "reason": "FailedCreate",
      "status": "True",
      "type": "ReplicaFailure"
    }
  ],
  "observedGeneration": 1,
  "unavailableReplicas": 1
}
kubectl describe replicaset chaosblade-operator-67779995db -n kube-system
result
Name:           chaosblade-operator-6db889f86
Namespace:      kube-system
Selector:       name=chaosblade-operator,pod-template-hash=6db889f86
Labels:         name=chaosblade-operator
                part-of=chaosblade
                pod-template-hash=6db889f86
Annotations:    deployment.kubernetes.io/desired-replicas: 1
                deployment.kubernetes.io/max-replicas: 2
                deployment.kubernetes.io/revision: 1
                meta.helm.sh/release-name: chaosblade-operator
                meta.helm.sh/release-namespace: kube-system
Controlled By:  Deployment/chaosblade-operator
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           name=chaosblade-operator
                    part-of=chaosblade
                    pod-template-hash=6db889f86
  Service Account:  chaosblade
  Containers:
   chaosblade-operator:
    Image:      chaosbladeio/chaosblade-operator:1.5.0
    Port:       9443/TCP
    Host Port:  0/TCP
    Command:
      chaosblade-operator
    Args:
      --chaosblade-image-repository=chaosbladeio/chaosblade-tool
      --chaosblade-version=1.5.0
      --chaosblade-image-pull-policy=IfNotPresent
      --log-level=info
      --webhook-enable
      --daemonset-enable
      --remove-blade-interval=72h
      --chaosblade-namespace=kube-system
    Environment:
      WATCH_NAMESPACE:
      POD_NAME:         (v1:metadata.name)
      OPERATOR_NAME:    chaosblade-operator
    Mounts:
      /tmp/k8s-webhook-server/serving-certs from cert (ro)
  Volumes:
   cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  chaosblade-webhook-server-cert
    Optional:    false
Conditions:
  Type            Status  Reason
  ReplicaFailure  True    FailedCreate
Events:
  Type     Reason        Age                 From                   Message
  Warning  FailedCreate  19s (x14 over 60s)  replicaset-controller  Error creating: Internal error occurred: failed calling webhook "chaosblade-operator.kube-system.svc": failed to call webhook: Post "https://chaosblade-webhook-server.kube-system.svc:443/mutating-pods?timeout=10s": dial tcp 10.104.96.126:443: connect: connection refused
It also prevents all other Kubernetes YAML manifests from being submitted until I uninstall chaosblade-operator-1.5.0.
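For anyone triaging the same symptom, the failing conditions and the webhook endpoint can be pulled out of the Deployment status with a few lines of Python. This is only a sketch: it assumes the output of `kubectl get deploy ... -o json` is available as valid JSON, the sample data is trimmed from the status shown above, and the helper names `failing_conditions` and `webhook_target` are hypothetical (they are not part of chaosblade or kubectl).

```python
import json
import re

# Sample trimmed from the status reported above (timestamps omitted).
SAMPLE = r'''
{"status": {"conditions": [
  {"type": "Progressing", "status": "True", "reason": "NewReplicaSetCreated",
   "message": "Created new replica set"},
  {"type": "Available", "status": "False", "reason": "MinimumReplicasUnavailable",
   "message": "Deployment does not have minimum availability."},
  {"type": "ReplicaFailure", "status": "True", "reason": "FailedCreate",
   "message": "Internal error occurred: failed calling webhook \"chaosblade-operator.kube-system.svc\": failed to call webhook: Post \"https://chaosblade-webhook-server.kube-system.svc:443/mutating-pods?timeout=10s\": dial tcp 10.101.171.120:443: connect: connection refused"}
]}}
'''

# Status value each Deployment condition type has when things are healthy.
HEALTHY = {"Available": "True", "Progressing": "True", "ReplicaFailure": "False"}


def failing_conditions(deployment):
    """Return the conditions whose status differs from the healthy value."""
    conditions = deployment.get("status", {}).get("conditions", [])
    # Unknown condition types are treated as healthy and skipped.
    return [c for c in conditions
            if HEALTHY.get(c["type"], c["status"]) != c["status"]]


def webhook_target(message):
    """Extract (webhook name, URL) from a 'failed calling webhook' message."""
    match = re.search(r'failed calling webhook "([^"]+)".*Post "([^"]+)"', message)
    return (match.group(1), match.group(2)) if match else None


if __name__ == "__main__":
    for cond in failing_conditions(json.loads(SAMPLE)):
        print(cond["type"], cond["reason"], webhook_target(cond["message"]))
```

The extracted URL names the Service that the API server tried to reach, which is the thing to check first when the error is "connection refused".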
Describe what you expected to happen
chaosblade-operator should install successfully.
How to reproduce it (as minimally and precisely as possible)
reinstall chaosblade-operator v1.5.0
helm --debug install chaosblade-operator chaosblade-operator-1.5.0.tgz --namespace kube-system --set webhook.enable=true
Tell us your environment
- Kubernetes v1.22.2
- helm v3.7.0
- platform: linux/amd64
Anything else we need to know?
I don't know whether it's just me or whether others have hit this too. I previously used chaosblade-operator v1.2.0 on Kubernetes v1.21.0 and never encountered a similar problem.
I ran into this as well. Did you resolve it?
There are no pods or images in my cluster, and the events look normal.
When I run: kubectl get deploy chaosblade-operator -n chaosblade -o json
it returns:
{
"apiVersion": "apps/v1",
"kind": "Deployment",
"metadata": {
"annotations": {
"deployment.kubernetes.io/revision": "1",
"meta.helm.sh/release-name": "chaosblade-operator",
"meta.helm.sh/release-namespace": "chaosblade"
},
"creationTimestamp": "2022-02-10T02:26:43Z",
"generation": 1,
"labels": {
"app.kubernetes.io/managed-by": "Helm"
},
"name": "chaosblade-operator",
"namespace": "chaosblade",
"resourceVersion": "18542",
"uid": "d7b7d5bf-c8ca-4f5c-8b2a-7705864fb0d8"
},
"spec": {
"progressDeadlineSeconds": 600,
"replicas": 1,
"revisionHistoryLimit": 10,
"selector": {
"matchLabels": {
"name": "chaosblade-operator"
}
},
"strategy": {
"rollingUpdate": {
"maxSurge": "25%",
"maxUnavailable": "25%"
},
"type": "RollingUpdate"
},
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"name": "chaosblade-operator",
"part-of": "chaosblade"
}
},
"spec": {
"containers": [
{
"args": [
"--chaosblade-image-repository=chaosbladeio/chaosblade-tool",
"--chaosblade-version=1.5.0",
"--chaosblade-image-pull-policy=IfNotPresent",
"--log-level=info",
"--webhook-enable",
"--daemonset-enable",
"--remove-blade-interval=72h",
"--chaosblade-namespace=chaosblade"
],
"command": [
"chaosblade-operator"
],
"env": [
{
"name": "WATCH_NAMESPACE"
},
{
"name": "POD_NAME",
"valueFrom": {
"fieldRef": {
"apiVersion": "v1",
"fieldPath": "metadata.name"
}
}
},
{
"name": "OPERATOR_NAME",
"value": "chaosblade-operator"
}
],
"image": "chaosbladeio/chaosblade-operator:1.5.0",
"imagePullPolicy": "IfNotPresent",
"name": "chaosblade-operator",
"ports": [
{
"containerPort": 9443,
"protocol": "TCP"
}
],
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/tmp/k8s-webhook-server/serving-certs",
"name": "cert",
"readOnly": true
}
]
}
],
"dnsPolicy": "ClusterFirst",
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"serviceAccount": "chaosblade",
"serviceAccountName": "chaosblade",
"terminationGracePeriodSeconds": 30,
"volumes": [
{
"name": "cert",
"secret": {
"defaultMode": 420,
"secretName": "chaosblade-webhook-server-cert"
}
}
]
}
}
},
"status": {
"conditions": [
{
"lastTransitionTime": "2022-02-10T02:26:43Z",
"lastUpdateTime": "2022-02-10T02:26:43Z",
"message": "Deployment does not have minimum availability.",
"reason": "MinimumReplicasUnavailable",
"status": "False",
"type": "Available"
},
{
"lastTransitionTime": "2022-02-10T02:26:44Z",
"lastUpdateTime": "2022-02-10T02:26:44Z",
"message": "Internal error occurred: failed calling webhook \"chaosblade-operator.chaosblade.svc\": Post \"https://chaosblade-webhook-server.chaosblade.svc:443/mutating-pods?timeout=10s\": dial tcp 10.1.255.232:443: connect: connection refused",
"reason": "FailedCreate",
"status": "True",
"type": "ReplicaFailure"
},
{
"lastTransitionTime": "2022-02-10T02:36:44Z",
"lastUpdateTime": "2022-02-10T02:36:44Z",
"message": "ReplicaSet \"chaosblade-operator-7f7b79fcc5\" has timed out progressing.",
"reason": "ProgressDeadlineExceeded",
"status": "False",
"type": "Progressing"
}
],
"observedGeneration": 1,
"unavailableReplicas": 1
}
}
My environment is: Kubernetes v1.20.0, helm v3.8.0, platform: linux/amd64.
Actually, when I used chaosblade-operator-1.4.0-v3.tgz, it worked.
Maybe the root cause is that the webhook was registered with the API server before the operator pod was ready. I'm just guessing. https://imroc.cc/post/201912/kubernetes-no-route-to-host/#%E9%97%AE%E9%A2%98%E5%8F%8D%E9%A6%88
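One way to check this guess is to look at how the webhook is registered. The sketch below, in Python, scans a list of mutating webhook configurations (the shape returned by `kubectl get mutatingwebhookconfigurations -o json`, using the `admissionregistration.k8s.io/v1` field names) for webhooks that call a given Service and reject requests when it is unreachable. The sample data is made up for illustration: the webhook name, Service name, and path come from the error messages in this thread, but the configuration object names and the helper `blocking_webhooks` are assumptions, not the actual chart output.

```python
# Hypothetical sample in the shape of a MutatingWebhookConfigurationList.
SAMPLE_CONFIGS = {
    "items": [
        {
            "metadata": {"name": "chaosblade-operator"},  # assumed object name
            "webhooks": [
                {
                    "name": "chaosblade-operator.kube-system.svc",
                    "failurePolicy": "Fail",
                    "clientConfig": {
                        "service": {
                            "name": "chaosblade-webhook-server",
                            "namespace": "kube-system",
                            "path": "/mutating-pods",
                        }
                    },
                }
            ],
        },
        {
            "metadata": {"name": "some-other-webhook"},  # unrelated example
            "webhooks": [
                {
                    "name": "other.example.svc",
                    "failurePolicy": "Ignore",
                    "clientConfig": {
                        "service": {"name": "other", "namespace": "default"}
                    },
                }
            ],
        },
    ]
}


def blocking_webhooks(config_list, service_name, namespace):
    """Return (configuration name, webhook name) pairs that call the given
    Service and fail closed when that Service cannot be reached."""
    hits = []
    for cfg in config_list.get("items", []):
        for hook in cfg.get("webhooks", []):
            # URL-based webhooks have no "service" entry; treat as no match.
            svc = hook.get("clientConfig", {}).get("service") or {}
            same_service = (svc.get("name"), svc.get("namespace")) == (
                service_name,
                namespace,
            )
            # admissionregistration.k8s.io/v1 defaults failurePolicy to Fail.
            fails_closed = hook.get("failurePolicy", "Fail") == "Fail"
            if same_service and fails_closed:
                hits.append((cfg["metadata"]["name"], hook["name"]))
    return hits


if __name__ == "__main__":
    print(blocking_webhooks(SAMPLE_CONFIGS, "chaosblade-webhook-server", "kube-system"))
```

If a matching entry exists while no operator pod is running behind the Service, Pod creations that match the webhook's rules would be rejected, including the operator's own pod. That would be consistent with both reports above, but it remains unconfirmed here.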