grafana-operator [Bug] Operator seems to talk to the Grafana instance too soon

Describe the bug: When (re)deploying Grafana, we see this in the operator logs:

2022-06-13T08:55:17.606Z ERROR failed to get or create namespace folder for dashboard {"folder": "ictsritip-monitoring", "dashboard": "", "error": "Get "http://admin:***@grafana-service.ictsritip-monitoring.svc.cluster.local:3000/api/folders": dial tcp 10.200.100.43:3000: connect: no route to host"}
github.com/go-logr/zapr.(*zapLogger).Error
    /go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Error
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/log/deleg.go:144
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.(*GrafanaDashboardReconciler).reconcileDashboards
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:282
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.(*GrafanaDashboardReconciler).Reconcile
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:100
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.SetupWithManager.func1
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:183
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.SetupWithManager.func2
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:192
2022-06-13T08:55:17.606Z ERROR error updating dashboard {"error": "Get "http://admin:***@grafana-service.ictsritip-monitoring.svc.cluster.local:3000/api/folders": dial tcp 10.200.100.43:3000: connect: no route to host"}
github.com/go-logr/zapr.(*zapLogger).Error
    /go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132
sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Error
    /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/log/deleg.go:144
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.(*GrafanaDashboardReconciler).manageError
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:467
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.(*GrafanaDashboardReconciler).reconcileDashboards
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:283
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.(*GrafanaDashboardReconciler).Reconcile
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:100
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.SetupWithManager.func1
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:183
github.com/grafana-operator/grafana-operator/v4/controllers/grafanadashboard.SetupWithManager.func2
    /workspace/controllers/grafanadashboard/grafanadashboard_controller.go:192

The operator does work, though: datasources, folders, and dashboards do get created in the end.

It seems the operator tries to talk to the Grafana deployment too early, before the container is ready to accept connections.

Version: Grafana operator 4.4.1, OpenShift 4.9.29

To Reproduce: Deploy or change a Grafana CR instance. Whenever Grafana is redeployed, this error appears in the logs.

Expected behavior: The operator should wait until the Grafana pod is ready before using its API.

Jun 13 '22 10:06 bukovjanmic

There is a readiness check for the deployment; we should figure out why it's not doing what we expect it to do.

Jun 14 '22 11:06 pb82

@bukovjanmic That's weird. In my local setup, I don't see anything like your example; all logs look fine:

grafana-operator-55f8687bfd-8krpq grafana-operator 2022-07-18T15:18:35.152Z	DEBUG	action-runner	(    7)     FAILED check deployment readiness
grafana-operator-55f8687bfd-8krpq grafana-operator 2022-07-18T15:18:35.152Z	DEBUG	controller-runtime.manager.events	Warning	{"object": {"kind":"Grafana","namespace":"monitoring","name":"grafana-operator-grafana","uid":"e42ba143-2787-41cd-a2f7-6d1f2371235f","apiVersion":"integreatly.org/v1alpha1","resourceVersion":"332628"}, "reason": "ProcessingError", "message": "deployment not ready"}
[...]
grafana-operator-55f8687bfd-8krpq grafana-operator 2022-07-18T15:19:05.248Z	DEBUG	action-runner	(    7)    SUCCESS check deployment readiness
grafana-operator-55f8687bfd-8krpq grafana-operator 2022-07-18T15:19:05.261Z	DEBUG	grafana-controller	desired cluster state met

Is it something that you see consistently? If not, how can we reproduce it?
Could you please share your full Grafana spec, along with the livenessProbe and readinessProbe sections of the generated deployment for the Grafana instance?

Jul 18 '22 15:07 weisdd

This issue hasn't been updated for a while, marking as stale, please respond within the next 7 days to remove this label

Sep 17 '22 11:09 github-actions[bot]
