node_exporter
Qdisc collector does not expose queues with a parent
Host operating system: output of uname -a
Linux k8s-secnet-node6 6.1.0-23-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64 GNU/Linux
node_exporter version: output of node_exporter --version
node_exporter, version 1.8.2 (branch: HEAD, revision: f1e0e8360aa60b6cb5e5cc1560bed348fc2c1895)
build user: root@e8029641b208
build date: 20240806-20:45:43
go version: go1.21.13
platform: linux/amd64
tags: unknown
node_exporter command line flags
--path.procfs=/host/proc --path.sysfs=/host/sys --web.listen-address=0.0.0.0:9100 --collector.qdisc
node_exporter log output
not relevant
Are you running node_exporter in Docker?
yes, we have correctly exposed the host /proc and /sys.
What did you do that produced an error?
We enabled qdisc metrics to correlate a networking issue with packet drops in an eBPF program, but it turns out that node_exporter only exposes metrics for qdiscs that have no parent. With tc -s qdisc show we see many packet drops on the eBPF qdiscs (type clsact), which have a parent qdisc defined. Because node_exporter does not expose these, it is very hard to correlate our networking issues with the packet drops there. Looking at the implementation, this is expected, since node_exporter by default skips all qdiscs that have a parent.
See https://github.com/prometheus/node_exporter/blob/b9d0932179a0c5b3a8863f3d6cdafe8584cedc8e/collector/qdisc_linux.go#L151
What did you expect to see?
I expected to see metrics for all qdiscs on the host, and to let users worry about possible cardinality issues. This collector is disabled by default anyway.
What did you see instead?
I saw only metrics for the root qdisc, which in this case is not that relevant.
Possible fixes
- also show metrics for non-root qdiscs (remove the condition in the linked function)
- possibly do this based on a CLI flag or environment variable?
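
To make the two options concrete, here is a rough sketch of a flag-gated filter. It is not the collector's actual code: `qdiscStat`, `selectQdiscs` and `includeChildren` are hypothetical names, the sample values are made up, and the sketch assumes a simplified model in which a parent value of 0 means "no parent". The real types and condition in qdisc_linux.go may differ.

```go
package main

import "fmt"

// qdiscStat is a hypothetical, simplified stand-in for the per-qdisc data
// the collector reads. It is not the type used by node_exporter.
type qdiscStat struct {
	Device string
	Kind   string
	Parent uint32 // assumption for this sketch: 0 means "no parent"
	Drops  uint32
}

// selectQdiscs returns the qdiscs that would be exported.
// With includeChildren set to false, only parentless qdiscs are kept, which
// models the behaviour described in this issue. With true, all are kept.
func selectQdiscs(all []qdiscStat, includeChildren bool) []qdiscStat {
	var out []qdiscStat
	for _, q := range all {
		if q.Parent != 0 && !includeChildren {
			continue // skipped: this qdisc has a parent
		}
		out = append(out, q)
	}
	return out
}

func main() {
	// Made-up sample: one parentless qdisc and one clsact qdisc with an
	// illustrative non-zero parent.
	sample := []qdiscStat{
		{Device: "eth0", Kind: "mq", Parent: 0, Drops: 0},
		{Device: "eth0", Kind: "clsact", Parent: 0xfffffff1, Drops: 1200},
	}
	for _, include := range []bool{false, true} {
		fmt.Printf("includeChildren=%v\n", include)
		for _, q := range selectQdiscs(sample, include) {
			fmt.Printf("  device=%s kind=%s drops=%d\n", q.Device, q.Kind, q.Drops)
		}
	}
}
```

Leaving the option off by default would keep the current metric set unchanged for existing users, while anyone who needs the child qdiscs could opt in and accept the extra series.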