kubewarden-controller
Feature Request: Kubewarden Telemetry
Is your feature request related to a problem?
To better understand how users benefit from Kubewarden, it would be useful to start collecting some information from running services. This would help us prioritize our future efforts. It takes time to collect enough data, so I think now is the right time to think about what we should collect, why, and how.
Solution you'd like
While the list is not exhaustive, it would be interesting to have access to the following metrics:
- Running Kubewarden instances
- Kubewarden policies installed
- Kubewarden policies version installed
- Kubewarden controller version
- Kubernetes version used
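As a rough illustration, the metrics above could be carried in a small report payload like the one sketched below. This is only a sketch under stated assumptions: the type names `PolicyInfo` and `TelemetryReport`, the field names, and the JSON encoding are all hypothetical and not part of any existing Kubewarden API. The number of running instances is assumed to be derived on the receiving side by counting reports, so it is not a field here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PolicyInfo:
    """One installed policy (hypothetical shape)."""
    name: str
    version: str


@dataclass
class TelemetryReport:
    """Hypothetical payload covering the metrics listed in this issue."""
    controller_version: str
    kubernetes_version: str
    policies: list[PolicyInfo] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested PolicyInfo dataclasses.
        return json.dumps(asdict(self), sort_keys=True)
```

A report would then be built from whatever the controller already knows about the cluster, for example `TelemetryReport("v1.0.0", "v1.27.3", [PolicyInfo("some-policy", "v0.1.0")]).to_json()`, where all values shown are placeholders.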
Alternatives you've considered
No response
Anything else?
Endpoint
Once collected, the metrics must be sent to some service, which still has to be identified. The Longhorn and Harvester projects have also implemented telemetry, so it could be useful to look at how they did it.
Links:
Here are some links that could be useful to better understand the topic:
I agree, this would provide some useful insights.
I have one concern about collecting this information:
> - Kubewarden policies installed
> - Kubewarden policies version installed
We definitely need that, but I see the risk of leaking names of policies created ad-hoc by our end users. I think we could solve this problem by:
- Looking at the Kubewarden metadata embedded into the `policy.wasm` file
- Looking at the `io.kubewarden.policy.url` attribute
- Ignoring the policies that don't have their code hosted on https://github.com/kubewarden/
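The filtering idea above could be sketched roughly as follows. It assumes the policy metadata has already been extracted from the wasm file into a plain dictionary of annotations; how that extraction happens is out of scope here. The function names `is_upstream_policy` and `reportable_policies` are hypothetical, chosen only for this illustration.

```python
from urllib.parse import urlparse

# Annotation key mentioned in this comment.
POLICY_URL_ANNOTATION = "io.kubewarden.policy.url"


def is_upstream_policy(annotations: dict[str, str]) -> bool:
    """Return True only if the policy source URL is under github.com/kubewarden/."""
    parsed = urlparse(annotations.get(POLICY_URL_ANNOTATION, ""))
    return (
        parsed.scheme == "https"
        and parsed.netloc.lower() == "github.com"
        # The trailing slash keeps look-alike orgs such as "kubewarden-foo" out.
        and parsed.path.startswith("/kubewarden/")
    )


def reportable_policies(policies: list[dict[str, str]]) -> list[dict[str, str]]:
    """Drop every policy whose annotations do not point at the upstream org."""
    return [p for p in policies if is_upstream_policy(p)]
```

Policies with a missing annotation are treated as not reportable, so anything written ad hoc by end users is left out by default.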
> Ignore the policies that don't have their code hosted on https://github.com/kubewarden/

That's a very good point.
Also, I learned that the CNCF could provide a platform to store the data, such as https://metrics.longhorn.io/
If we start to collect this kind of info, don't we need to ask permission from users? Otherwise, won't we be violating GDPR?
Thinking about that... is this kind of info covered by GDPR? Or is it only for personal info? :thinking:
GDPR (and similar regulations) covers personal data; in this case, that could mean resource names, IPs, etc. Making the collection anonymous enough and making sure we don't log IPs and the like should make it safe.
Still, there's the etiquette with our users. I personally don't like telemetry-by-default, yet I see value for the user if it's only about notifying them of what needs an update, for example, and they can of course opt out of the already minimal telemetry.