opentelemetry-operator
Write failed manifests to log at debug level
When the reconciler fails to apply a manifest, there's very little the user can do to figure out exactly what the failed manifest payload was or where it came from.
It would be very helpful to have a debug-level log record that captures the manifest payload when the apply fails.
For example, this recent error I hit:
{"level":"error","ts":1652677368.8442643,"logger":"controllers.OpenTelemetryCollector","msg":"failed to reconcile config maps","error":"failed to reconcile the expected configmaps: failed to apply changes: ConfigMap \"otel-collector\" is invalid: metadata.labels: Invalid value: \"8f65b4d94bb5290c8fc1540703c06f7a7a12cfd917d2f141bdc8a18803828615\": must be no more than 63 characters","stacktrace":"github.com/open-telemetry/opentelemetry-operator/controllers.(*OpenTelemetryCollectorReconciler).Reconcile\n\t/workspace/controllers/opentelemetrycollector_controller.go:153\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
is inscrutable at best without being able to access the generated ConfigMap. The value shown is not present in the input CRD, and when decoded as hex does not appear to have any textual content.
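One observation that may help narrow this down: the rejected value is 64 hex characters long, which is the length of a hex-encoded SHA-256 digest and one character over the 63-character limit for label values. Whether the operator actually derives this label from a SHA-256 hash is an assumption here, and the input string below is made up. A minimal sketch of the length arithmetic:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// maxLabelValueLen is the Kubernetes limit on the length of a label value.
const maxLabelValueLen = 63

// hexDigest returns the hex-encoded SHA-256 digest of data.
// That the operator computes its label this way is an assumption.
func hexDigest(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// fitsLabelValue reports whether s is within the label value length limit.
// It checks length only, not the allowed character set.
func fitsLabelValue(s string) bool {
	return len(s) <= maxLabelValueLen
}

func main() {
	// The value quoted in the error message above.
	rejected := "8f65b4d94bb5290c8fc1540703c06f7a7a12cfd917d2f141bdc8a18803828615"
	fmt.Println(len(rejected), fitsLabelValue(rejected))

	// Any hex-encoded SHA-256 digest has the same length (hypothetical input).
	digest := hexDigest([]byte("hypothetical collector config"))
	fmt.Println(len(digest), fitsLabelValue(digest))
}
```

Both lines report a length of 64 and `false`, so any untruncated hex SHA-256 digest would be rejected as a label value in the same way.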
When --zap-log-level=debug is passed, it'd be helpful to have the generated configmap dumped.
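A rough sketch of what I have in mind, not the operator's actual code. It uses the standard library `log/slog` (Go 1.21+) in place of the operator's logr/zap logger so that it runs on its own. The `configMap` type, `applyWithDebugDump`, `runDemo`, and the label key are all hypothetical names made up for illustration.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"log/slog"
)

// configMap is a hypothetical stand-in for the generated ConfigMap.
type configMap struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels,omitempty"`
	Data   map[string]string `json:"data,omitempty"`
}

// applyWithDebugDump calls apply and, if it fails, writes the full manifest
// to the log at debug level before returning the wrapped error.
func applyWithDebugDump(ctx context.Context, log *slog.Logger, cm configMap, apply func(configMap) error) error {
	err := apply(cm)
	if err == nil {
		return nil
	}
	// Only pay for serialization when debug logging is enabled.
	if log.Enabled(ctx, slog.LevelDebug) {
		if payload, mErr := json.Marshal(cm); mErr == nil {
			log.Debug("failed to apply manifest",
				"kind", "ConfigMap", "name", cm.Name, "manifest", string(payload))
		}
	}
	return fmt.Errorf("failed to apply changes: %w", err)
}

// runDemo applies a ConfigMap with an apply function that always fails and
// returns whatever was written to the log.
func runDemo(debug bool) string {
	level := slog.LevelInfo
	if debug {
		level = slog.LevelDebug
	}
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, &slog.HandlerOptions{Level: level}))

	cm := configMap{
		Name: "otel-collector",
		// Hypothetical label key; the value is the one from the error above.
		Labels: map[string]string{
			"example.com/config-hash": "8f65b4d94bb5290c8fc1540703c06f7a7a12cfd917d2f141bdc8a18803828615",
		},
	}
	rejectAll := func(configMap) error {
		return errors.New("metadata.labels: Invalid value")
	}
	_ = applyWithDebugDump(context.Background(), logger, cm, rejectAll)
	return buf.String()
}

func main() {
	// With debug enabled, the log contains the full manifest as JSON.
	fmt.Print(runDemo(true))
}
```

At the default level nothing extra is logged, so the dump costs nothing unless someone has opted in to debug output.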
The specific error above seems to arise from https://github.com/open-telemetry/opentelemetry-operator/blob/0dce2dfbdaefa86b0052775641a38263a7599266/pkg/collector/reconcile/configmap.go#L188
with a configmap created by https://github.com/open-telemetry/opentelemetry-operator/blob/0dce2dfbdaefa86b0052775641a38263a7599266/pkg/collector/reconcile/configmap.go#L41
See also https://github.com/open-telemetry/opentelemetry-operator/issues/873
> When the reconciler fails to apply a manifest, there's very little the user can do to figure out exactly what the failed manifest payload was or where it came from.
What do you mean by manifest? The CR created by the user, or the OTEL ConfigMap created by the operator? Note that the OTEL ConfigMap should match the collector config from the CR, so the user already has access to it.
closed by #2193 and superseded by #2399