Node exporter complains about unknown metric lines from NFSd on Linux 6.6-rc1

Open klausman opened this issue 2 years ago • 5 comments

Host operating system: output of uname -a

Linux 6.6.0-rc1 vanilla self compiled and running Debian otherwise

node_exporter version: output of node_exporter --version

$ ./node_exporter --version
node_exporter, version 1.6.1 (branch: master, revision: f34aaa61092fe7e3c6618fdb0b0d16a68a291ff7)
  build user:       klausman@felka
  build date:       20230911-15:50:34
  go version:       go1.21.1
  platform:         linux/amd64
  tags:             netgo osusergo static_build

node_exporter command line flags

./node_exporter --web.listen-address=:9101

node_exporter log output

ts=2023-09-11T15:53:30.969Z caller=collector.go:169 level=error msg="collector failed" name=nfsd duration_seconds=0.00018264 err="failed to retrieve nfsd stats: unknown NFSd metric line \"wdeleg_getattr\""

Are you running node_exporter in Docker?

Nope, running in host OS directly.

What did you do that produced an error?

Ran node-exporter as described above

What did you expect to see?

No errors about unknown metric lines. Not sure if that metric line should actually become a useful exported metric.

What did you see instead?

The log message above

klausman avatar Sep 11 '23 15:09 klausman

This needs to be handled (at least) in the procfs module. I have a proof-of-concept change here:

https://github.com/klausman/procfs/commit/9c4dcd1232831c555a6ea1c6c82b4e40be5f3bb1

klausman avatar Sep 21 '23 10:09 klausman

PR on procfs: https://github.com/prometheus/procfs/pull/574

Also, I think it's less than ideal that one new stat results in node-exporter not exporting any stats about NFSd anymore. Logging unknown stats (though spammy) is fine, but just not exporting any stats in that case is very brittle.

klausman avatar Sep 21 '23 10:09 klausman

Seeing https://github.com/prometheus/procfs/pull/574 is in, I think this can be closed now.

rexagod avatar Mar 17 '24 22:03 rexagod

> Logging unknown stats (though spammy) is fine, but just not exporting any stats in that case is very brittle.

IIUC, for this to happen, procfs would need to return metric data at least for the set of metrics that it can resolve successfully (instead of erroring out), and the collectors can then be updated to consume it. Any metrics that come back empty can simply be queried again in the next pass. The important part is that all available metric data is returned and that one failing case does not interrupt the rest.
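A minimal sketch of that idea, for illustration only. `parseTolerant` is a hypothetical helper, not the procfs API, and the sample input is made up; it only assumes lines of the form "label value value ..." like the ones in the nfsd stats file. Known labels are parsed and returned, and unrecognised labels are collected separately so the caller can log them without dropping everything else:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseTolerant is a hypothetical helper (not part of procfs). It reads
// lines shaped like "<label> <v1> <v2> ..." and keeps every label listed
// in known. Labels it does not recognise are returned in a separate slice
// instead of aborting the whole parse.
func parseTolerant(input string, known map[string]bool) (map[string][]uint64, []string, error) {
	stats := make(map[string][]uint64)
	var unknown []string

	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		label := fields[0]
		if !known[label] {
			// Remember the label so the caller can log it, then move on.
			unknown = append(unknown, label)
			continue
		}
		values := make([]uint64, 0, len(fields)-1)
		for _, f := range fields[1:] {
			v, err := strconv.ParseUint(f, 10, 64)
			if err != nil {
				// A malformed value on a known line is still a hard error.
				return nil, nil, fmt.Errorf("label %q: %w", label, err)
			}
			values = append(values, v)
		}
		stats[label] = values
	}
	if err := scanner.Err(); err != nil {
		return nil, nil, err
	}
	return stats, unknown, nil
}

func main() {
	// Made-up sample input; only the label names mirror this issue.
	input := "rc 0 0 1\nth 8 0\nwdeleg_getattr 0\n"
	known := map[string]bool{"rc": true, "th": true}

	stats, unknown, err := parseTolerant(input, known)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("parsed labels:", len(stats))
	fmt.Println("unknown labels:", unknown)
}
```

With this shape, a collector could export the parsed stats and log the unknown labels at a lower level, rather than failing the whole nfsd collector on one new line.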

rexagod avatar Mar 17 '24 22:03 rexagod

When is this fix coming to the Debian repos? I'm running prometheus-node-exporter 1.5.0-1+b6 and still see the issue in the logs.

josecarre avatar Apr 30 '24 06:04 josecarre