topmetrics collector causing panic.
Describe the bug
mongodb_exporter panics when the topmetrics collector is enabled and the exporter is connected to a cloud Atlas MongoDB instance.
To Reproduce
Steps to reproduce the behavior:
- Parameters passed to mongodb_exporter: --no-mongodb.direct-connect --collect-all
- Steps to reproduce the issue: start mongodb_exporter with the parameters above, connect to a cloud Atlas MongoDB instance, and curl the metrics endpoint.
Expected behavior
Metrics are shown.
Logs
2023/06/22 13:24:25 http: panic serving 10.4.166.252:48284: descriptor Desc{fqName: "", help: "", constLabels: {}, variableLabels: []} is invalid: label value "\xf8\xc2}\xbbz\x14" is not valid UTF-8
goroutine 4026201 [running]:
net/http.(*conn).serve.func1()
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:1850 +0xb8
panic({0x6d3800, 0x400091cc90})
/opt/hostedtoolcache/go/1.19.9/x64/src/runtime/panic.go:890 +0x260
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(...)
/home/runner/go/pkg/mod/github.com/!percona-!lab/[email protected]/prometheus/registry.go:403
github.com/percona/mongodb_exporter/exporter.(*Exporter).makeRegistry(0x40000a5e50, {0x974de0?, 0x40005d98c0}, 0x400028c8f0, {0x9725e8?, 0x40007a8ff0}, {{0x400028f8f0, 0x1, 0x1}, 0x0, ...})
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/exporter.go:214 +0xbfc
github.com/percona/mongodb_exporter/exporter.(*Exporter).Handler.func1({0x9744d0, 0x400007e000}, 0x40001ec400)
/home/runner/work/mongodb_exporter/mongodb_exporter/exporter/exporter.go:332 +0x418
net/http.HandlerFunc.ServeHTTP(0x40004c6ad8?, {0x9744d0?, 0x400007e000?}, 0x0?)
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:2109 +0x38
net/http.(*ServeMux).ServeHTTP(0x0?, {0x9744d0, 0x400007e000}, 0x40001ec400)
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:2487 +0x140
net/http.serverHandler.ServeHTTP({0x40006e5a40?}, {0x9744d0, 0x400007e000}, 0x40001ec400)
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:2947 +0x2cc
net/http.(*conn).serve(0x40005308c0, {0x974e18, 0x40002ebbf0})
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:1991 +0x544
created by net/http.(*Server).Serve
/opt/hostedtoolcache/go/1.19.9/x64/src/net/http/server.go:3102 +0x43c
Environment
- OS: Linux 5.15.108 #1 SMP Wed May 24 23:53:57 UTC 2023 aarch64 GNU/Linux
- Environment (docker, k8s, etc.): EKS v1.26.4-eks-0a21954
- MongoDB version: 4.4.22
Additional context
mongodb_exporter version 0.39.0
Additionally, this affects only one of our clusters (prod) and not the others (dev), even though both run the same MongoDB version (4.4.22) with the same hardware configuration and replica count. As a workaround, we currently run mongodb_exporter with every collector except topmetrics enabled, and so far we have seen no issues.
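The panic in the logs above complains about a label value that is not valid UTF-8. One possible mitigation, sketched here with the Go standard library only, is to sanitize such values before they are used as Prometheus labels. The helper name sanitizeLabelValue is hypothetical and is not part of mongodb_exporter. Where the invalid bytes come from is also an assumption: presumably a namespace name in the top output, which would fit the problem appearing on only one cluster.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitizeLabelValue is a hypothetical helper, not part of mongodb_exporter.
// It replaces each run of invalid UTF-8 bytes with U+FFFD so the result is
// always valid UTF-8. Valid input is returned unchanged.
func sanitizeLabelValue(v string) string {
	if utf8.ValidString(v) {
		return v
	}
	return strings.ToValidUTF8(v, "\uFFFD")
}

func main() {
	// The label value quoted in the panic message above.
	raw := "\xf8\xc2}\xbbz\x14"
	clean := sanitizeLabelValue(raw)

	fmt.Println("raw is valid UTF-8:  ", utf8.ValidString(raw))
	fmt.Println("clean is valid UTF-8:", utf8.ValidString(clean))
	fmt.Printf("clean value: %q\n", clean)
}
```

Replacing the bytes keeps the metric visible at the cost of a lossy label. Dropping the affected series instead would be an equally reasonable choice.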
Hello there,
I'm also facing some issues with topmetrics.
2023/10/08 16:42:35 http: panic serving <redacted>:54398: descriptor Desc{fqName: "", help: "", constLabels: {}, variableLabels: []} is invalid: (BSONObjectTooLarge) BSONObj size: 17453485 (0x10A51AD) is invalid. Size must be between 0 and 16793600(16MB) First element: note: "all times in microseconds"
Environment
- OS: 5.14.0-284.25.1.0.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 4 09:00:16 PDT 2023 x86_64 x86_64 x86_64 GNU/Linux
- MongoDB 6.0.10-1
I'm the furthest thing from a mongodb expert, but I have this same issue on one of my replica sets.
In rs1 with 3 nodes, topmetrics works perfectly on all servers. In rs2 with 15 nodes, topmetrics does not work at all on the secondary servers. The same goes for running db.adminCommand({top:1}) from the mongo shell.
The point is that the exporter can do nothing to overcome this maximum size limit in MongoDB, but it could detect whether topmetrics is available on a node before attempting to scrape it. That way it could collect topmetrics from only the primary of really large replica sets (alternatively, I suppose it could just check isMaster, or isPrimary if that was renamed in newer releases).
I have zero Go experience outside of this attempt, but I tried having an AI update exporter.go to detect whether topmetrics is available. It didn't work at all, and even if it had, it looks like an incomplete solution: the exporter appears to initialize once at startup, so when the primary rotates we'd run into this error again.
I'd love to hear some suggestions on making this work, other than not collecting topmetrics.
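For what it's worth, here is a minimal sketch of the "check on every scrape and skip on failure" idea described above. It uses only the Go standard library. Every name in it (topRunner, collectTop, fakeNode, errTopTooLarge) is hypothetical and none of it is taken from mongodb_exporter. A real version would wrap the MongoDB driver call behind the interface.

```go
package main

import (
	"errors"
	"fmt"
)

// errTopTooLarge stands in for the BSONObjectTooLarge error quoted above.
var errTopTooLarge = errors.New("BSONObjectTooLarge")

// topRunner is a hypothetical abstraction over running the "top" admin
// command on one node. A real implementation would wrap the MongoDB driver.
type topRunner interface {
	RunTop() (map[string]int64, error)
}

// collectTop is meant to be called on every scrape, so the outcome follows
// the node's current state rather than its state at startup. A failing
// "top" yields ok == false and no metrics instead of a panic.
func collectTop(r topRunner) (metrics map[string]int64, ok bool) {
	m, err := r.RunTop()
	if err != nil {
		fmt.Println("skipping topmetrics:", err)
		return nil, false
	}
	return m, true
}

// fakeNode simulates a node for the purposes of this sketch.
type fakeNode struct {
	data map[string]int64
	err  error
}

func (n fakeNode) RunTop() (map[string]int64, error) {
	return n.data, n.err
}

func main() {
	healthy := fakeNode{data: map[string]int64{"test.coll.total.count": 42}}
	failing := fakeNode{err: errTopTooLarge}

	for _, n := range []fakeNode{healthy, failing} {
		if m, ok := collectTop(n); ok {
			fmt.Println("collected", len(m), "top metrics")
		}
	}
}
```

Because the check happens per scrape, a node that starts returning a usable top result later (or stops doing so) is picked up on the next request without restarting the exporter.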