
Qdisc collector does not expose queues with a parent

Open bh-tt opened this issue 1 year ago • 2 comments

Host operating system: output of uname -a

Linux k8s-secnet-node6 6.1.0-23-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.99-1 (2024-07-15) x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

node_exporter, version 1.8.2 (branch: HEAD, revision: f1e0e8360aa60b6cb5e5cc1560bed348fc2c1895)
  build user:       root@e8029641b208
  build date:       20240806-20:45:43
  go version:       go1.21.13
  platform:         linux/amd64
  tags:             unknown

node_exporter command line flags

--path.procfs=/host/proc --path.sysfs=/host/sys --web.listen-address=0.0.0.0:9100 --collector.qdisc     

node_exporter log output

not relevant

Are you running node_exporter in Docker?

yes, we have correctly exposed the host /proc and /sys.

What did you do that produced an error?

We enabled qdisc metrics to correlate a networking issue with packet drops in an eBPF program, but it turns out that node_exporter only exposes metrics for qdiscs that have no parent. With tc -s qdisc show we see a lot of packet drops on eBPF qdiscs (type clsact), which have a parent qdisc defined. Because node_exporter does not expose these, it is very hard to correlate our networking issues with the packet drops there. Looking at the implementation, this behaviour is expected: the collector skips every qdisc that has a parent.

See https://github.com/prometheus/node_exporter/blob/b9d0932179a0c5b3a8863f3d6cdafe8584cedc8e/collector/qdisc_linux.go#L151

What did you expect to see?

I expected to see metrics for all qdiscs on the host, and to let users worry about possible cardinality issues. This collector is disabled by default anyway.

What did you see instead?

I saw only metrics for the root qdisc, which in this case is not that relevant.

Possible fixes

  • also show metrics for non-root qdiscs (remove the condition in the linked function)
  • possibly make this configurable through a CLI flag or environment variable?
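
To make the two options above concrete, here is a minimal Go sketch of a parent filter with an opt-in switch. It is not the collector's actual code: the qdiscInfo type, the selectQdiscs function and the includeChildren parameter are hypothetical names invented for illustration, the sample values are made up, and treating Parent == 0 as "no parent" is an assumption of this sketch.

```go
package main

import "fmt"

// qdiscInfo is a hypothetical, simplified stand-in for the per-qdisc data
// a collector would read over netlink. It is not the real library type.
type qdiscInfo struct {
	IfaceName string
	Kind      string
	Handle    uint32
	Parent    uint32 // assumption in this sketch: 0 means "no parent"
	Drops     uint32
}

// selectQdiscs returns the qdiscs that would be exported as metrics.
// With includeChildren=false only parentless qdiscs are kept, which is the
// behaviour described in this issue. With includeChildren=true every qdisc
// is kept.
func selectQdiscs(all []qdiscInfo, includeChildren bool) []qdiscInfo {
	var out []qdiscInfo
	for _, q := range all {
		if q.Parent != 0 && !includeChildren {
			continue // skip qdiscs that have a parent
		}
		out = append(out, q)
	}
	return out
}

func main() {
	// Made-up sample data: one parentless qdisc and one clsact qdisc
	// attached under a parent, as tc -s qdisc show would list them.
	sample := []qdiscInfo{
		{IfaceName: "eth0", Kind: "mq", Handle: 0x10000, Parent: 0, Drops: 0},
		{IfaceName: "eth0", Kind: "clsact", Handle: 0xffff0000, Parent: 0xfffffff1, Drops: 1234},
	}

	for _, include := range []bool{false, true} {
		fmt.Printf("includeChildren=%v:\n", include)
		for _, q := range selectQdiscs(sample, include) {
			fmt.Printf("  device=%s kind=%s parent=%#x drops=%d\n",
				q.IfaceName, q.Kind, q.Parent, q.Drops)
		}
	}
}
```

In a real change the boolean would presumably come from a command-line flag, and the exported series would likely need an extra label (for example the parent or handle) so that several qdiscs of the same kind on one device stay distinguishable. Both points are assumptions about a possible design, not something the current collector provides.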

bh-tt, Aug 20 '24 07:08