litmus
Unable to initialize probes
What happened: When applying the example Prometheus probe, Litmus fails to initialize the probes.
What you expected to happen: The probes are initialized and Litmus probes Prometheus.
Where can this issue be corrected? (optional)
How to reproduce it (as minimally and precisely as possible):
kubectl apply -f example.yaml
# contains the promProbe, which executes the query and matches the result against the expected criteria
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  appinfo:
    appns: "default"
    applabel: "app.kubernetes.io/instance=dummy-dev"
    appkind: "deployment"
  chaosServiceAccount: litmus-runner
  experiments:
    - name: pod-delete
      spec:
        probe:
          - name: "check-probe-success"
            type: "promProbe"
            promProbe/inputs:
              # endpoint of the Prometheus service
              endpoint: "prometheus-kube-prometheus-prometheus.observability.svc.cluster.local:9090"
              # PromQL query to execute
              query: "vector(1)"
              comparator:
                # criteria that the actual output must satisfy against the expected value
                # supports >=, <=, >, <, ==, != comparison
                criteria: "=="
                # expected value, which should follow the specified criteria
                value: "1"
            mode: "Edge"
            runProperties:
              probeTimeout: 5
              interval: 5
              retry: 1
Check the logs of the pod running the experiment:
➜ litmus k logs -f pod-delete-3xno17-xcj47 -n litmus
W0411 20:53:35.940676 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
time="2022-04-11T20:53:35Z" level=info msg="Experiment Name: pod-delete"
time="2022-04-11T20:53:35Z" level=info msg="[PreReq]: Getting the ENV for the experiment"
time="2022-04-11T20:53:35Z" level=error msg="Unable to initialize the probes, err: unable to Get the chaosengine, err: v1alpha1.ChaosEngine.Spec: v1alpha1.ChaosEngineSpec.Experiments: []v1alpha1.ExperimentList: v1alpha1.ExperimentList.Spec: v1alpha1.ExperimentAttributes.Probe: []v1alpha1.ProbeAttributes: v1alpha1.ProbeAttributes.CmdProbeInputs: v1alpha1.CmdProbeInputs.Source: ReadString: expects \" or n, but found {, error found in #10 byte of ...|\"source\":{}},\"httpPr|..., bigger context ...|e\":[{\"cmdProbe/inputs\":{\"comparator\":{},\"source\":{}},\"httpProbe/inputs\":{\"method\":{\"get\":{},\"post\":{|..."
Anything else we need to know?:
When running kubectl get chaosengine engine-nginx -n litmus -o yaml, it appears that empty versions of the other probe types were added to the resource. The error message seems to indicate that the issue is with the cmdProbe, which we did not mention in our manifest.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
  namespace: litmus
spec:
  annotationCheck: "false"
  appinfo:
    appkind: deployment
    applabel: app.kubernetes.io/instance=dummy-dev
    appns: defauult
  chaosServiceAccount: litmus-runner
  components:
    runner:
      image: litmuschaos/chaos-runner:2.6.0
      resources: {}
  engineState: stop
  experiments:
  - name: pod-delete
    spec:
      components:
        resources: {}
        statusCheckTimeouts: {}
      probe:
      - cmdProbe/inputs:
          comparator: {}
          source: {}
        httpProbe/inputs:
          method:
            get: {}
            post: {}
        k8sProbe/inputs: {}
        mode: Edge
        name: check-probe-success
        promProbe/inputs:
          comparator:
            criteria: ==
            value: "1"
          endpoint: prometheus-kube-prometheus-prometheus.observability.svc.cluster.local:9090
          query: vector(1)
        runProperties:
          interval: 5
          probeTimeout: 5
          retry: 1
        type: promProbe
status:
  engineStatus: completed
  experiments:
  - experimentPod: pod-delete-u545hj-jtx6s
    lastUpdateTime: "2022-04-11T21:12:48Z"
    name: pod-delete
    runner: engine-nginx-runner
    status: Completed
    verdict: Pass
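To make the stray stanzas easier to spot, here is a rough Python sketch that lists every "<type>/inputs" key on a probe that does not match the probe's declared type. It assumes the "<type>/inputs" naming seen in the manifests above holds for every probe type. The function name stray_probe_inputs and the trimmed STORED_ENGINE sample are made up for illustration and are not part of Litmus.

```python
from __future__ import annotations

# Assumption: each probe type keeps its settings under "<type>/inputs",
# as in the manifests above (promProbe -> "promProbe/inputs").
INPUTS_SUFFIX = "/inputs"

# Trimmed copy of the stored ChaosEngine shown above, as a Python dict.
STORED_ENGINE = {
    "spec": {
        "experiments": [
            {
                "name": "pod-delete",
                "spec": {
                    "probe": [
                        {
                            "cmdProbe/inputs": {"comparator": {}, "source": {}},
                            "httpProbe/inputs": {"method": {"get": {}, "post": {}}},
                            "k8sProbe/inputs": {},
                            "mode": "Edge",
                            "name": "check-probe-success",
                            "promProbe/inputs": {"query": "vector(1)"},
                            "type": "promProbe",
                        }
                    ]
                },
            }
        ]
    }
}


def stray_probe_inputs(engine: dict) -> list[tuple[str, str, str]]:
    """Return (experiment, probe, key) for every inputs stanza that does
    not belong to the probe's declared type."""
    stray = []
    for experiment in engine.get("spec", {}).get("experiments", []):
        probes = experiment.get("spec", {}).get("probe") or []
        for probe in probes:
            expected_key = f"{probe.get('type')}{INPUTS_SUFFIX}"
            for key in probe:
                if key.endswith(INPUTS_SUFFIX) and key != expected_key:
                    stray.append((experiment.get("name"), probe.get("name"), key))
    return stray


if __name__ == "__main__":
    for experiment, probe, key in stray_probe_inputs(STORED_ENGINE):
        print(f"{experiment}/{probe}: unexpected {key}")
```

Against a live cluster, the same function could be fed the parsed output of kubectl get chaosengine engine-nginx -n litmus -o json instead of the embedded sample.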
Tagging @AmitKumarDas
Hi @robertb724, this issue seems to be due to a version mismatch between litmus-go and chaos-operator. Can you try running this with the same (compatible) version for both and see if the issue persists?
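One quick way to check for such a mismatch is to compare the image tags of the components involved. The Python sketch below covers only the simplest case, where "compatible" means identical tags, which may be stricter than what Litmus actually requires. The chaos-runner image comes from the resource above. The operator and experiment image references are hypothetical placeholders, to be replaced with whatever is actually deployed.

```python
from __future__ import annotations


def image_tag(image: str) -> str:
    """Return the tag of a container image reference, or "latest" if none."""
    # Drop any digest, then look for a colon in the last path segment only,
    # so that a registry port such as "registry:5000/..." is not mistaken for a tag.
    name = image.split("@", 1)[0]
    last_segment = name.rsplit("/", 1)[-1]
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"


def tags_by_component(images: dict[str, str]) -> dict[str, str]:
    """Map each component name to the tag of its image."""
    return {component: image_tag(image) for component, image in images.items()}


def versions_match(images: dict[str, str]) -> bool:
    """True when every component runs the same image tag."""
    return len(set(tags_by_component(images).values())) <= 1


if __name__ == "__main__":
    images = {
        # Taken from the ChaosEngine shown above.
        "chaos-runner": "litmuschaos/chaos-runner:2.6.0",
        # Hypothetical values for illustration only.
        "chaos-operator": "litmuschaos/chaos-operator:2.6.0",
        "experiment": "litmuschaos/go-runner:2.5.0",
    }
    for component, tag in tags_by_component(images).items():
        print(f"{component}: {tag}")
    if not versions_match(images):
        print("image tags differ; align the versions and re-run the experiment")
```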
Closing due to inactivity, feel free to re-open this issue if the problem persists.