grafana-operator
Enable Grafana Operator for restricted network environments
Grafana Operator must meet additional requirements to run properly in a restricted network (disconnected) environment, as described in the OpenShift documentation: https://docs.openshift.com/container-platform/4.9/operators/operator_sdk/osdk-generating-csvs.html?extIdCarryOver=true&sc_cid=701f2000001OH7JAAW#olm-enabling-operator-for-restricted-network_osdk-generating-csvs
@ljuaneda looking at the requirements from OpenShift, this seems doable.
@pb82 do you intend to add this feature in the next release?
also related to https://github.com/grafana-operator/grafana-operator/issues/333
@ljuaneda if you can add a PR for this it would be great.
I was able to get the grafana operator working in an OpenShift cluster disconnected from the internet.
First I followed steps similar to those mentioned in #763. As mentioned in #333, the ImageContentSourcePolicy only works when images are specified by their sha256 digests. Since the images in the ClusterServiceVersion use tags instead, the grafana operator deployment gets stuck with ErrImagePull (see output in #763).
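For context, a mirror rule of the kind referred to above might look like the following sketch. The mirror registry host `mirror.example.internal:5000`, the file name, and the policy name are placeholders, not values from this thread. The field name `repositoryDigestMirrors` hints at the digest requirement: the rule is only consulted for pulls by digest, so tag-based references bypass it.

```shell
# Sketch only: registry host, file name, and policy name are hypothetical.
cat <<'EOT' > icsp-grafana-operator.yaml
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: grafana-operator-mirror
spec:
  repositoryDigestMirrors:
    - source: quay.io/grafana-operator/grafana-operator
      mirrors:
        - mirror.example.internal:5000/grafana-operator/grafana-operator
    - source: docker.io/grafana/grafana
      mirrors:
        - mirror.example.internal:5000/grafana/grafana
EOT
# Then, on the cluster: oc apply -f icsp-grafana-operator.yaml
```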
I was able to solve the error by patching the ClusterServiceVersion resource in the cluster. First create a patch file like this (Note: I'm a few versions behind. These are the sha256 for images from version 4.4.1):
cat <<EOT > patch-csv-for-disconnected.yaml
# Fix the images for the grafana operator itself
- op: replace
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/0/image
  value: gcr.io/kubebuilder/kube-rbac-proxy@sha256:db06cc4c084dd0253134f156dddaaf53ef1c3fb3cc809e5d81711baa4029ea4c
- op: replace
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/image
  value: quay.io/grafana-operator/grafana-operator@sha256:4553e3e3dbce24a351d7ade3642b878be857b08e24d0d22303207bac0eca815d
# Fix the images used by the grafana operator to launch grafana instances
- op: add
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/env/-
  value:
    name: GRAFANA_IMAGE_URL
    value: docker.io/grafana/grafana@sha256
- op: add
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/env/-
  value:
    name: GRAFANA_IMAGE_TAG
    value: 8e6fe7907f8e5c5547bee5e3e8be8165144d86ad98581d6d092044aa5f805c39
- op: add
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/env/-
  value:
    name: GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_URL
    value: quay.io/grafana-operator/grafana_plugins_init@sha256
- op: add
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/env/-
  value:
    name: GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_TAG
    value: be63237852d5cc53c95246bec4e5871caaf094c5bb128c6ea679ec81d2bb417a
# I find log level info much more useful than the default of error. Also
# add the --scan-all option so dashboards can be defined in any namespace.
- op: replace
  path: /spec/install/spec/deployments/0/spec/template/spec/containers/1/args
  value:
    - --health-probe-bind-address=:8081
    - --metrics-bind-address=127.0.0.1:8080
    - --scan-all
    - --zap-log-level=info
EOT
The last patch is not required for disconnected operation. I added it because I wanted the log level at info and wanted to use the --scan-all option.
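Splitting each image into a URL ending in `@sha256` and a bare hex digest as the "tag" looks odd at first. It appears to rely on the operator joining the two values with a colon. That is an assumption inferred from the patch, not something confirmed in this thread. Under that assumption, the two environment variables combine into a valid digest reference:

```shell
# Assumption: the operator builds the image reference as "<URL>:<TAG>".
GRAFANA_IMAGE_URL='docker.io/grafana/grafana@sha256'
GRAFANA_IMAGE_TAG='8e6fe7907f8e5c5547bee5e3e8be8165144d86ad98581d6d092044aa5f805c39'

# Join the two values the way the operator is assumed to.
image_ref="${GRAFANA_IMAGE_URL}:${GRAFANA_IMAGE_TAG}"
echo "$image_ref"
```

The same reasoning would apply to the GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_URL and GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_TAG pair.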
Apply the above patch using a command like this:
# Adjust the namespace as needed and adjust the CSV name
# as needed for your version
kubectl -n grafana patch csv grafana-operator.v4.4.1 \
  --patch-file patch-csv-for-disconnected.yaml --type json
It took several minutes for the OLM operator to decide to redeploy the grafana-operator-controller-manager with the changes from the patched CSV, but once it did, kubelet was able to pull the images and the pod came up successfully.
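Rather than polling by hand during those minutes, the wait can be scripted. This is a minimal sketch, assuming the Deployment is named grafana-operator-controller-manager as mentioned above and lives in the grafana namespace used in this thread. The helper name `wait_for_operator` and the 10 minute timeout are arbitrary choices made for illustration.

```shell
# Block until the operator deployment has rolled out, or fail after 10 minutes.
# The namespace defaults to "grafana", as used elsewhere in this thread.
wait_for_operator() {
  local namespace="${1:-grafana}"
  kubectl -n "$namespace" rollout status \
    deployment/grafana-operator-controller-manager --timeout=10m
}

# Usage: wait_for_operator grafana
```

Note that `rollout status` can return immediately if OLM has not yet updated the Deployment, so it may need to be re-run once the new image shows up in the Deployment spec.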
To be more concrete about what the CSV patch does, here is a command showing the changes it made:
$ diff <(kubectl -n grafana get csv grafana-operator.v4.4.1 -o yaml) <(kubectl -n grafana patch csv grafana-operator.v4.4.1 --patch-file openshift/monitoring/grafana-operator/patch-csv-for-disconnected.yaml --type json --dry-run=client -o yaml)
353c353
< image: gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
---
> image: gcr.io/kubebuilder/kube-rbac-proxy@sha256:db06cc4c084dd0253134f156dddaaf53ef1c3fb3cc809e5d81711baa4029ea4c
363c363,364
< - --zap-log-level=error
---
> - --scan-all
> - --zap-log-level=info
375c376,384
< image: quay.io/grafana-operator/grafana-operator:v4.4.1
---
> - name: GRAFANA_IMAGE_URL
> value: docker.io/grafana/grafana@sha256
> - name: GRAFANA_IMAGE_TAG
> value: 8e6fe7907f8e5c5547bee5e3e8be8165144d86ad98581d6d092044aa5f805c39
> - name: GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_URL
> value: quay.io/grafana-operator/grafana_plugins_init@sha256
> - name: GRAFANA_PLUGINS_INIT_CONTAINER_IMAGE_TAG
> value: be63237852d5cc53c95246bec4e5871caaf094c5bb128c6ea679ec81d2bb417a
> image: quay.io/grafana-operator/grafana-operator@sha256:4553e3e3dbce24a351d7ade3642b878be857b08e24d0d22303207bac0eca815d
I'm willing to work on a pull request for this, but I'm not clear which files need to change. Is bundle/manifests/grafana-operator.clusterserviceversion.yaml a generated file? If not, then it is a simple matter of making the above changes to that file. But maybe the deployment information in that file is derived from the kustomization in config/manager, and the patches should be added or modified there instead?
Also, maybe PREPARE_RELEASE.md needs instructions for discovering the sha256 digests of the images and using them during the release steps?
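One possible way to discover a digest during release is sketched below. It assumes skopeo is installed and the registry is reachable. The helper name `digest_ref` is made up for illustration and is not part of the project's release tooling.

```shell
# Print the digest-pinned form of a tag-based image reference.
# Requires skopeo and network access to the source registry.
digest_ref() {
  local image="$1"   # e.g. quay.io/grafana-operator/grafana-operator:v4.4.1
  local digest
  digest="$(skopeo inspect --format '{{.Digest}}' "docker://${image}")" || return 1
  # Drop the trailing ":tag" and append "@sha256:...".
  printf '%s@%s\n' "${image%:*}" "$digest"
}

# Usage: digest_ref quay.io/grafana-operator/grafana-operator:v4.4.1
```

The tag stripping is deliberately naive: it expects the reference to end in `:tag`, and a reference without a tag on a registry with a port number would be mangled. For multi-architecture images, it is worth checking which manifest the reported digest refers to before pinning it.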
One other change would be very helpful for disconnected operation. If the related images are specified in the CSV, then the OpenShift mirroring command will automatically bring these images in. As it is now, only the grafana operator images are mirrored. The grafana images need to be mirrored manually.
relatedImages:
  - image: >-
      docker.io/grafana/grafana@sha256:8e6fe7907f8e5c5547bee5e3e8be8165144d86ad98581d6d092044aa5f805c39
    name: grafana/grafana
  - image: >-
      quay.io/grafana-operator/grafana_plugins_init@sha256:be63237852d5cc53c95246bec4e5871caaf094c5bb128c6ea679ec81d2bb417a
    name: grafana-operator/grafana_plugins_init
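To check whether an installed CSV already carries such entries, they can be listed with a JSONPath query. This sketch assumes the list sits under `spec.relatedImages` of the CSV. The helper name `list_related_images` is hypothetical.

```shell
# List the name and image of every entry under spec.relatedImages of a CSV.
list_related_images() {
  local namespace="$1" csv="$2"
  kubectl -n "$namespace" get csv "$csv" \
    -o jsonpath='{range .spec.relatedImages[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}

# Usage: list_related_images grafana grafana-operator.v4.4.1
```

An empty result would suggest that the mirroring command only sees the images referenced directly in the deployment spec.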
Any thoughts?
Any update regarding this? It limits the usage of the operator, as a lot of installations are disconnected.
Any update?
Work ongoing in https://github.com/grafana-operator/grafana-operator/pull/1234
Fixed in #1234