High Cardinality Metrics - Prometheus Remote Write (experimental-prometheus-rw)
Feature Description
When using the Prometheus Remote Write functionality, it is not currently possible to limit which labels are included in the series sent to Prometheus. We currently have an issue where the full URL is included in the `k6_http_`-prefixed metrics: the unique IDs randomly generated by some of our developers' tests explode the cardinality, leading to poor performance when querying or when developers use the k6 Grafana dashboard for time spans longer than a few minutes.
For example, URLs like the following result in high cardinality within Prometheus:
http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78
I could ask the developers to use a smaller, fixed number of IDs, but this doesn't seem feasible in the long run.
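For concreteness, here is a minimal sketch of the test pattern that triggers this (a hypothetical script, using the endpoint above): every iteration requests a distinct URL, so each request becomes its own `url`/`name` time series.

```javascript
import http from 'k6/http';
import { uuidv4 } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';

export default function () {
  // Each iteration hits a unique URL, so the remote-write output emits a
  // brand-new time series per request -- the cardinality explosion above.
  const id = uuidv4();
  http.get(`http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/${id}`);
}
```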
Suggested Solution (optional)
I'm looking for something similar to Prometheus' relabel config functionality (regex matching and manipulation), or for the k6 library to have some way to mark/sanitize URLs containing unique IDs (e.g. OpenTelemetry tracing does this automatically based on common HTTP server frameworks).
Alternatively, even some command-line flags to drop certain labels from the emitted metrics would be helpful.
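To illustrate the mark/sanitize idea: something like it can be approximated in userland today by masking IDs before they reach the `name` tag. The `maskIds` helper and its regex below are hypothetical, not an existing k6 API:

```javascript
import http from 'k6/http';

// Hypothetical helper (not a k6 API): collapse UUID path segments into a
// placeholder so all such requests share one "name" tag and one time series.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;

function maskIds(url) {
  return url.replace(UUID_RE, '{id}');
}

export default function () {
  const url =
    'http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78';
  // The request still goes to the real URL; only the metric tag is sanitized.
  http.get(url, { tags: { name: maskIds(url) } });
}
```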
Already existing or connected issues / PRs (optional)
No response
Edit/Update
- I found this in the k6 documentation about how to limit the `name` label, but I'd need to test whether it can also be used to edit/limit the `url` label. Though this still leaves it to the discretion of the developer and their test, rather than allowing me to override the default behavior.
We've arrived at a solution where we use URL grouping via the `name` tag.
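For anyone else hitting this, the workaround looks roughly like the sketch below (reusing the example endpoint from the issue description); both forms set the `name` tag so all per-ID requests collapse into a single series:

```javascript
import http from 'k6/http';

export default function () {
  const id = '2b38e5b6-7e72-41a2-9f4d-9f2e51787e78';

  // Option 1: explicit "name" tag on the request.
  http.get(
    `http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/${id}`,
    { tags: { name: 'http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/{id}' } }
  );

  // Option 2: the http.url tagged template, which sets the "name" tag to the
  // template text with each interpolated value replaced by a placeholder.
  http.get(http.url`http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/${id}`);
}
```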
This is good enough, but it leaves the implementation to the developers, and I can imagine many will forget; we'll still have to chase down the offending k6 test and deal with the fallout of whatever cardinality explosion happens before we catch it.
I'd still like a way to control this behavior with an environment variable (or config file) so that we as the platform engineering team can enforce it with a Kyverno policy, etc. We manage an internal k6 Helm chart for the developers, so we'd have the opportunity to set it there.
We seem to have a similar problem. We are using browser tests with Prometheus RW, and they send a lot of URL metrics, so URL relabeling is absolutely necessary. Grouping tags in the browser seems to be impossible, and the remote-write `write_relabel_config` doesn't work very well: some URLs are relabeled and some are not (even with correct regexes).
It would be great to have relabeling in k6.
Got burned by this again today 😅 I had forgotten I had made a GH issue and decided to check. Humbly asking for your consideration 🙇
Hi @jameshounshell, apologies for the delayed response; this issue unfortunately slipped off our radar.
We don't think that incorporating this feature directly into k6 is correct. As you've noted, we already offer a workaround to prevent this problem. Implementing a relabeling feature on the output would significantly impact performance, potentially affecting load generation capabilities.
From a metrics-management perspective, we generally avoid highly specialized features for individual outputs, preferring solutions that apply universally. In Grafana Cloud k6, we address this by alerting in the app, which usually resolves the issue after the first run. If Grafana Cloud k6 doesn't meet your needs, we suggest adopting a similar alerting mechanism on your platform.
If blocking ingestion is a strict requirement, we recommend exploring solutions within the Prometheus ecosystem to get the relabel configuration working. Could you elaborate on the challenges you're facing with it?
> Grouping tags in the browser seems to be impossible
Regarding grouping tags in the browser, @artem-zherdiev-ingio we have that covered. There's a dedicated feature for it, and you can find more information here: https://grafana.com/docs/k6/latest/using-k6-browser/recommended-practices/prevent-too-many-time-series-error.
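For reference, that docs page does the grouping from the test script via the page's `metric` event; a minimal sketch for recent k6 versions follows (the regex and tag name are just illustrations, and the import path and async API vary across k6 versions):

```javascript
import { browser } from 'k6/browser';

export const options = {
  scenarios: {
    ui: {
      executor: 'shared-iterations',
      options: { browser: { type: 'chromium' } },
    },
  },
};

export default async function () {
  const page = await browser.newPage();

  // Group every per-application URL under one tag before metrics are
  // emitted, so unique IDs don't each create a new time series.
  page.on('metric', (metric) => {
    metric.tag({
      name: 'applicationsURL',
      matches: [{ url: /\/api\/v1\/applications\/[0-9a-f-]+$/ }],
    });
  });

  try {
    await page.goto(
      'http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78'
    );
  } finally {
    await page.close();
  }
}
```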
I hope this clarifies expectations. I'm closing the issue as not planned, but feel free to add more comments.