Even though node_intr_total is a counter-type metric, its value decreased

Open sriharshabm opened this issue 2 years ago • 6 comments

Host operating system: output of uname -a

Linux csd01lab-ddeio-0 3.10.0-1160.15.2.el7.x86_64 #1 SMP Thu Jan 21 16:15:07 EST 2021 x86_64 x86_64 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

node_exporter, version 1.0.1

Are you running node_exporter in Docker?

No. node_exporter runs on a VNF based on OpenStack.

What did you do that produced an error?

No specific scenario.

What did you expect to see?

The node_intr_total value should either stay the same as in the previous scrape or increase.

What did you see instead?

The node_intr_total value decreased compared to the previous scrape. [image]

Is this a known issue? If so, is a fix available in a recent release?

sriharshabm, Jan 04 '23 08:01

The exporter only reports what the kernel tells it to. So this is likely a bug in your kernel version.

SuperQ, Jan 04 '23 08:01

This sounds like https://lore.kernel.org/lkml/[email protected]/

Looking at the supplied plot, it appears to have decreased by 2^32.
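
To illustrate why the drop would be close to 2^32, here is a minimal sketch. It assumes the total is a plain sum of per-CPU counters that each wrap at 32 bits; the sample values and the `total_interrupts` helper are made up for illustration and are not taken from the reporter's system.

```python
WRAP = 2**32  # assumed width of each per-CPU kernel counter


def total_interrupts(per_cpu):
    """Sum per-CPU counters, as an aggregated total would be built."""
    return sum(per_cpu)


# Hypothetical readings: CPU 0 sits just below the 32-bit limit.
before = [4_294_967_290, 1_000, 2_000]
# CPU 0 handles 10 more interrupts and wraps; the others advance by 5 and 3.
after = [(before[0] + 10) % WRAP, 1_005, 2_003]

true_increase = 10 + 5 + 3
drop = total_interrupts(before) - total_interrupts(after)

# The aggregated total falls by 2^32 minus the real increase.
print(drop, WRAP - true_increase)
```

With a short scrape interval the real increase is tiny compared with 2^32, so the plot shows a step down of almost exactly 2^32.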

dswarbrick, Jan 06 '23 10:01

Good find. I'm not sure how easy it will be to actually work around that problem, other than exposing the individual per-cpu counters directly so resets can be handled.
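
As a rough sketch of why per-CPU series would be easier to handle: if each counter is known to wrap at a fixed width, its increase between two reads can be recovered with modular arithmetic before summing. This is not what rate() does and not something node_exporter implements; the 32-bit width, the "at most one wrap between reads" condition, and the function names are all assumptions for illustration.

```python
WRAP = 2**32  # assumed counter width


def unwrapped_delta(prev, cur, wrap=WRAP):
    """Increase of one fixed-width counter between two reads.

    Only correct if the counter wrapped at most once in between.
    """
    return (cur - prev) % wrap


def total_increase(prev_per_cpu, cur_per_cpu):
    """Sum the per-counter increases instead of differencing the totals."""
    return sum(unwrapped_delta(p, c) for p, c in zip(prev_per_cpu, cur_per_cpu))


# Hypothetical readings: the first counter wraps between the two reads.
prev = [4_294_967_290, 1_000, 2_000]
cur = [4, 1_005, 2_003]
print(total_increase(prev, cur))
```

Differencing the two totals directly would give a large negative number here, while the per-counter approach recovers the small real increase.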

SuperQ, Jan 06 '23 11:01

Is this a problem? You'd rate() over this which should handle the reset, right?

discordianfish, Feb 02 '23 19:02

> Is this a problem? You'd rate() over this which should handle the reset, right?

It's not quite the same as a typical counter reset, since this counter is composed of multiple individual counters tracked by the kernel. When just one of those individual 32-bit kernel counters rolls over, it causes the node_intr_total metric to go backwards. But the total has not restarted from zero, so applying rate() or increase() to it would not be mathematically correct.
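
A simplified model of the problem, for illustration only: the `naive_increase` function below mimics the usual counter-reset rule (any decrease is treated as a restart from zero, so the post-drop value is counted in full) and ignores the extrapolation that rate() and increase() also perform. The sample values are hypothetical.

```python
def naive_increase(samples):
    """Total increase over a series, treating every decrease as a reset to 0."""
    total = 0
    prev = samples[0]
    for cur in samples[1:]:
        # On a drop, assume the counter restarted and count cur in full.
        total += cur if cur < prev else cur - prev
        prev = cur
    return total


# Hypothetical aggregated totals around a single 32-bit wrap of one CPU counter.
# The real number of interrupts between the two scrapes is 18.
samples = [4_294_970_290, 3_012]
print(naive_increase(samples))
```

Because the total dropped to 3012 rather than to roughly zero, the reset rule reports an increase of 3012 where the real increase was 18, and the error grows with the combined value of the counters that did not wrap.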

dswarbrick, Feb 02 '23 23:02

Ugh, got it. Still, I'm not sure what we can do on our side. Any ideas?

discordianfish, Mar 07 '23 12:03