
[Bug] Since 5.13.0 - Grafana Operator cannot manage TLS protected internal Grafanas

Open diranged opened this issue 1 year ago • 4 comments

Describe the bug

In our environment we pass TLS certs to our Grafana service so that traffic is encrypted end to end. It seems that the Grafana Operator stopped being able to connect to our Grafana instances after https://github.com/grafana/grafana-operator/pull/1628 shipped in 5.13.0. We get the following reconciliation errors:

    "status": {
        "hash": "9250f003846c19a973bd035ce560da23aaad2fdc855a951d63c99d75b7c40a03",
        "lastMessage": "fetching data sources: Get \"https://grafana-app-service.grafana:3000/api/datasources\": tls: failed to verify certificate: x509: certificate signed by unknown authority",
        "lastResync": "2024-09-12T17:50:18Z",
        "uid": "loki"
    }
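
For readers unfamiliar with this error, here is a minimal Go sketch of how it arises. It is not the operator's code: the `get` and `demo` helpers and the local test server are assumptions made for illustration. A client whose root pool does not contain the server's issuer fails verification, while a client that has been given the issuing certificate succeeds.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// get performs a GET against url, verifying the server certificate
// against the given root pool only.
func get(url string, roots *x509.CertPool) error {
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: roots},
	}}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// demo starts a local HTTPS server with a self-signed certificate and
// returns the results of an untrusting and a trusting client.
func demo() (untrusted, trusted error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "[]")
	}))
	defer srv.Close()

	// Empty pool: the server's issuer is unknown to this client.
	untrusted = get(srv.URL+"/api/datasources", x509.NewCertPool())

	// Pool that contains the server's own certificate.
	known := x509.NewCertPool()
	known.AddCert(srv.Certificate())
	trusted = get(srv.URL+"/api/datasources", known)

	return untrusted, trusted
}

func main() {
	untrusted, trusted := demo()
	fmt.Println("untrusted client:", untrusted)
	fmt.Println("trusted client:", trusted)
}
```

The first call is expected to fail with an unknown-authority error of the same kind as in the status above; the second should return no error.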

Logs:

2024-09-13T20:58:43Z	ERROR	GrafanaDatasourceReconciler	error reconciling datasource	{"controller": "grafanadatasource", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaDatasource", "GrafanaDatasource": {"name":"grafana-app-root","namespace":"grafana"}, "namespace": "grafana", "name": "grafana-app-root", "reconcileID": "c1671f2b-3f25-436b-a479-7b2fe96edbdf", "datasource": "grafana-app-root", "grafana": "grafana-app", "error": "fetching data sources: Get \"https://grafana-app-service.grafana:3000/api/datasources\": tls: failed to verify certificate: x509: certificate signed by unknown authority"}
github.com/grafana/grafana-operator/v5/controllers.(*GrafanaDatasourceReconciler).Reconcile
	github.com/grafana/grafana-operator/v5/controllers/datasource_controller.go:252
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222

Version: v5.13.0

To Reproduce

Create a Grafana with a TLS config...

    config:
    ...
      server:
        ca_cert: /certs/ca.crt
        cert_file: /certs/tls.crt
        cert_key: /certs/tls.key
        domain: ....com
        protocol: https
        root_url: https://....com
    deployment:
      spec:
        template:
          spec:
            containers:
              - volumeMounts:
                  - mountPath: /certs/ca.crt
                    name: ca
                    readOnly: true
                    subPath: ca.crt
                  - mountPath: /certs/tls.crt
                    name: tls
                    readOnly: true
                    subPath: tls.crt
                  - mountPath: /certs/tls.key
                    name: tls
                    readOnly: true
                    subPath: tls.key
            volumes:
              - name: ca
                secret:
                  defaultMode: 420
                  optional: false
                  secretName: grafana-app-cacert
              - name: tls
                secret:
                  defaultMode: 420
                  optional: false
                  secretName: grafana-app-tls

diranged avatar Sep 13 '24 21:09 diranged

Thanks for reporting this. The TLS settings introduced in #1628 should have only affected external instances, but they had the unintended side effect of requiring complete certificate chains on all instances.

As a workaround, you can try mounting your ca.crt in the operator manager container under /etc/ssl/certs/ca-certificates.crt until we have a fix ready.
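
Some background on why this workaround can help: on Linux, Go's crypto/x509 package loads its system roots from a list of well-known bundle files, and /etc/ssl/certs/ca-certificates.crt is one of them. A file mounted at that path takes the place of whatever bundle the image ships there, so inside that container only the mounted CA would be trusted through that file. The sketch below shows the equivalent idea in code, adding a private CA on top of the system roots. It is an illustration only, not the operator's implementation and not the pending fix; `poolWithExtraCA` and `demo` are hypothetical names, and the PEM data stands in for the contents of a mounted ca.crt.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// poolWithExtraCA returns the system root pool extended with the
// certificates found in pemBytes. If the system pool is unavailable,
// it starts from an empty pool instead.
func poolWithExtraCA(pemBytes []byte) (*x509.CertPool, error) {
	pool, err := x509.SystemCertPool()
	if err != nil || pool == nil {
		pool = x509.NewCertPool()
	}
	if !pool.AppendCertsFromPEM(pemBytes) {
		return nil, errors.New("no certificates found in PEM data")
	}
	return pool, nil
}

// demo checks the helper against a local HTTPS server whose
// self-signed certificate plays the role of the private CA.
func demo() error {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {}))
	defer srv.Close()

	// Stand-in for the contents of a mounted ca.crt file.
	caPEM := pem.EncodeToMemory(&pem.Block{
		Type:  "CERTIFICATE",
		Bytes: srv.Certificate().Raw,
	})

	pool, err := poolWithExtraCA(caPEM)
	if err != nil {
		return err
	}
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: pool},
	}}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("request succeeded with the extra CA in the pool")
}
```

Starting from the system pool keeps public CAs trusted alongside the private one, which is the main difference from overwriting the bundle file.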

theSuess avatar Sep 17 '24 07:09 theSuess

Thanks - for now we just rolled the operator upgrade back...

diranged avatar Sep 17 '24 14:09 diranged

Is this issue fixed in v5.14.0?

Thanks

brogger71 avatar Oct 10 '24 06:10 brogger71

No - the fix in https://github.com/grafana/grafana-operator/pull/1690 is still unmerged.

diranged avatar Oct 10 '24 15:10 diranged