grafana-operator
Add status.condition to grafanadashboards and grafanadatasources
**Is your feature request related to a problem? Please describe.**
As a user of the grafana operator I can't look at the grafanadashboard/datasource CR and see whether it has been applied to my Grafana instance. This creates issues for me as a user, since I can't know whether the operator has done its job and applied the dashboard without going into Grafana.
**Describe the solution you'd like**
My initial idea is to use a condition of type: Ready with a message in it, following kstatus.
Since it's supported for a controller that doesn't own a specific resource, for example a grafanadashboard, to provide it with a condition, I think it would be nice if the grafana controller updated the grafanadashboard's conditions field when it applies a specific dashboard to the grafana instance. A sketch of what that could look like on the controller side follows the example.
```yaml
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: simple-dashboard
  labels:
    app: grafana
spec:
  json: >
    {
      "id": null,
      "title": "Such a nice dashboard",
      "links": []
    }
status:
  conditions:
    - lastTransitionTime: "2021-10-19T13:05:52Z"
      message: 'Grafana instance X returned 200'
      reason: DashboardApplied
      status: "True"
      type: Ready
```
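On the controller side this could look roughly like the minimal sketch below, assuming the CRD grows a status.conditions field of type []metav1.Condition. meta.SetStatusCondition is the apimachinery helper; the setDashboardReady helper, the applied flag, and the import path for the dashboard type are illustrative assumptions, not the operator's actual API.

```go
package controllers

import (
	"context"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Assumed import path; the real one depends on the operator's module layout.
	grafanav1alpha1 "github.com/grafana-operator/grafana-operator/api/integreatly/v1alpha1"
)

// setDashboardReady is a hypothetical helper that records whether a dashboard
// was applied to the Grafana instance, following the kstatus Ready convention.
func setDashboardReady(ctx context.Context, c client.Client, d *grafanav1alpha1.GrafanaDashboard, applied bool, msg string) error {
	cond := metav1.Condition{
		Type:    "Ready",
		Status:  metav1.ConditionFalse,
		Reason:  "Reconciling",
		Message: msg,
	}
	if applied {
		cond.Status = metav1.ConditionTrue
		cond.Reason = "DashboardApplied"
	}
	// SetStatusCondition only bumps LastTransitionTime when the status changes.
	meta.SetStatusCondition(&d.Status.Conditions, cond)
	return c.Status().Update(ctx, d)
}
```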
The same feature would apply to datasources and notificationchannels, with similar fields.
By following the kstatus fields we could use GitOps tools like Flux and its built-in health feature to make sure that a dashboard has been applied to a Grafana instance, thus getting fast feedback on whether my Grafana dashboards are working as intended.
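As a hedged illustration, a Flux Kustomization could then gate on the dashboard via its healthChecks field (Flux evaluates custom resources with kstatus). The names, path, and namespaces below are made up, and the apiVersion should match your installed Flux version:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: dashboards
  namespace: flux-system
spec:
  interval: 10m
  path: ./dashboards
  prune: true
  sourceRef:
    kind: GitRepository
    name: fleet-repo
  timeout: 2m
  healthChecks:
    - apiVersion: integreatly.org/v1alpha1
      kind: GrafanaDashboard
      name: simple-dashboard
      namespace: default
```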
**Describe alternatives you've considered**
There are other options than status.condition, but it seems like a good solution, and the status field probably has to be used in some way.
We could also update the Grafana instance's status field, but to me that would be updating the wrong CR.
**Additional context**
A good blog post about conditions: https://maelvls.dev/kubernetes-conditions/
Kubernetes API conventions: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
Kstatus: https://github.com/kubernetes-sigs/cli-utils/blob/master/pkg/kstatus/README.md
I'm still very new to status.conditions myself and trying to wrap my head around them, but I think this would really help our users get fast feedback from the operator.
**Existing solutions**
Today we do not use status on grafanadashboards; instead you have to log in to your Grafana instance and check whether the dashboard has been created. If it hasn't, you have to go into the grafana-operator logs/yaml and debug whether you have set the correct label selector or whether Grafana didn't like your dashboard.
If we add status.condition we will need to take into account what happens when a new Grafana instance is created to replace the old one. We should probably update the condition on all grafanadashboards with a new status.
As I see it we can go two ways here. Either we can remove the status.conditions entry completely from the grafanadashboard CR, since if no condition is defined the status is assumed unknown, and once the grafanadashboard has been added to the newly created grafana instance again we can add the condition back.
This might be seen as a bit strange from a user's point of view though: the resource had a condition before and now it's suddenly gone. So it might be more reasonable to update the existing condition with something like the following (a sketch of that bulk update comes after the example):
```yaml
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: simple-dashboard
  labels:
    app: grafana
spec:
  json: >
    {
      "id": null,
      "title": "Such a nice dashboard",
      "links": []
    }
status:
  conditions:
    - lastTransitionTime: "2021-10-19T13:05:52Z"
      message: 'Creating new grafana instance'
      reason: Reconciling
      status: "False"
      type: Ready
```
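Reusing the imports and assumed types from the sketch further up, the bulk update on instance recreation could look roughly like this hypothetical helper (markDashboardsReconciling and the per-namespace listing are illustrative choices, not a committed design):

```go
// markDashboardsReconciling is a hypothetical bulk update: flip Ready to
// False on every dashboard in the namespace while a new Grafana instance
// is being created, so GitOps health checks report the true state until
// each dashboard has been re-applied.
func markDashboardsReconciling(ctx context.Context, c client.Client, ns string) error {
	var list grafanav1alpha1.GrafanaDashboardList
	if err := c.List(ctx, &list, client.InNamespace(ns)); err != nil {
		return err
	}
	for i := range list.Items {
		d := &list.Items[i]
		meta.SetStatusCondition(&d.Status.Conditions, metav1.Condition{
			Type:    "Ready",
			Status:  metav1.ConditionFalse,
			Reason:  "Reconciling",
			Message: "Creating new grafana instance",
		})
		if err := c.Status().Update(ctx, d); err != nil {
			return err
		}
	}
	return nil
}
```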
I'm just thinking out loud right now but it's something that we need to think about ^^
Accepted. We're going to initially implement a PoC for this in a single resource to gather feedback, get opinions, and see how this affects things; if successful, we'll try to implement similar logic for the remaining resources. This also requires some special consideration for multi-namespace support, and we also have to consider this as a liability in terms of compatibility between versions going forward. We can pick this up after implementing #433.
This has been solved in the v5 implementation.