jaeger-operator
jaeger-operator copied to clipboard
CR failures should be reported via status object
This issue somehow related to #466, however this may need a different fix
Invalid CR files error should be reported to the user console. And should not keep on trying to deploy the CR file.
I changed the boolean value to string (eg: true >> "true"
) and integer to string (ex: 10 >> "10"
)
deployed the CR file. on the console, it looks all ok. But on the operator-log keep on throwing an exception.
Expected behavior:
- The error should be thrown on the user console [OR] the client side
- AFAIK, There is no use to keep on trying to deploy the invalid CR file. Stop it on the first failure
- I believe if we read everything as a string and convert it to the specific format may avoid most of the format issues.
No exception on the console.
$ oc create -f jaeger-services-final.yaml
jaeger.jaegertracing.io/jaegerqe created
storage:
type: elasticsearch
esIndexCleaner:
enabled: true
schedule: "*/10 * * * *"
numberOfDays: "2"
operator log file: jaeger-operator-545c69455d-4clq5.log
CR file:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: "jaegerqe"
spec:
ingress:
security: none
strategy: production
collector:
replicas: 1
image: jaegertracing/jaeger-collector:1.12
resources:
requests:
memory: "512Mi"
cpu: "1"
limits:
memory: "512Mi"
cpu: "1"
options:
log-level: info
metrics-backend: prometheus
collector:
num-workers: 1
queue-size: 20000
es:
bulk:
size: 524288
workers: 1
flush-interval: 200ms
tags-as-fields:
all: false
query:
replicas: 1
image: jaegertracing/jaeger-query:1.12
resources:
requests:
cpu: "500m"
memory: "512Mi"
limits:
cpu: "500m"
memory: "512Mi"
options:
log-level: "info"
metrics-backend: prometheus
query:
port: 16686
es:
timeout: 10s
agent:
strategy: sidecar
image: jaegertracing/jaeger-agent:1.12
resources:
requests:
cpu: "200m"
memory: "128Mi"
limits:
cpu: "200m"
memory: "128Mi"
options:
log-level: info
metrics-backend: prometheus
processor:
jaeger-compact:
server-queue-size: 10000
workers: 10
storage:
type: elasticsearch
esIndexCleaner:
enabled: true
schedule: "*/10 * * * *"
numberOfDays: "2"
dependencies:
enabled: false
elasticsearch:
image: quay.io/openshift/origin-logging-elasticsearch5:latest
nodeCount: 3
resources:
But on the operator-log keep on throwing an exception.
What's the exception?
@jpkrohling Operator log is attached on the initial comment.
E0607 06:27:47.976474 1 reflector.go:125] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:93: Failed to list *v1.Jaeger: v1.JaegerList.Items: []v1.Jaeger: v1.Jaeger.Spec: v1.JaegerSpec.Storage: v1.JaegerStorageSpec.EsIndexCleaner: v1.JaegerEsIndexCleanerSpec.NumberOfDays: readUint64: unexpected character: �, error found in #10 byte of ...|rOfDays":"2","schedu|..., bigger context ...|,"esIndexCleaner":{"enabled":true,"numberOfDays":"2","schedule":"*/10 * * * *"},"type":"elasticsearc|...
Sorry, I missed that. Whenever possible, let's try to keep the error messages in the issue itself, so that people searching for the error message can find this bug report.
I changed the scope of this issue a bit, but the idea is the same: users creating a CR should be able to see the error messages that happened during the process. Storing the error in a status object allows it to be viewed via kubectl describe jaeger
I recommend a Validating Webhook rather than admitting the jaegertracing.io/v1 Jaeger and then setting its Status.
The only time it is reasonable to set the Status is if the status can't be checked at admission time. (For example, a Jaeger can have a priorityClassName: collector-high-priority
but at admission time the scheduling.k8s.io/v1 PriorityClass might not exist yet.). The only way to implement this is to watch for changes on the CR and every kind of object the CR might refer to, which is why this kind of checking is rarely implemented. (In Istio that kind of checking is opt-in because of performance problems.)