node_exporter icon indicating copy to clipboard operation
node_exporter copied to clipboard

xvda[0-9]+ disks (not partitions) are ignored by default

Open luizluca opened this issue 3 years ago • 4 comments

Host operating system: output of uname -a

Linux 5.3.18-24.64-default

node_exporter version: output of node_exporter --version

node_exporter, version (branch: , revision: ) build user:
build date:
go version: go1.14.7

node_exporter command line flags

none

Are you running node_exporter in Docker?

no

What did you do that produced an error?

default config does not collect disk stats

What did you expect to see?

all disk devices should be collected by default

What did you see instead?

No disk stats

Since #13 got merged, xvda[0-9]+ are ignored. However, it is not correct to assume that those are always partitions. Xen allows a machine to inform "virtual partition", which are mapped to dedicated devices (it makes expanding a filesystem trivial).

This is a normal xvda-disk with partitions (possibly on what that PR was based on):

# ls /sys/block/xvda* -l
lrwxrwxrwx 1 root root 0 Mai 17 14:46 /sys/block/xvda -> ../devices/vbd-51712/block/xvda
# grep xvda /proc/diskstats 
 202       0 xvda 26438 52 1897369 17235 4263982 53800 47477992 3542063 0 103104 3557922
 202       1 xvda1 22 0 176 7 0 0 0 0 0 7 7
 202       2 xvda2 118 0 13522 94 129 3 30326 274 0 42 367
 202       3 xvda3 26268 52 1880583 17122 4263853 53797 47447666 3541789 0 103052 3557536

And this is a "virtual-partitions" machine:

# ls /sys/block/xvd* -d1
/sys/block/xvda1
/sys/block/xvda2
# grep xvda /proc/diskstats 
 202       1 xvda1 66701 3 13340088 570748 1833 254 308680 72 0 32440 674096
 202       2 xvda2 101 0 4448 0 0 0 0 0 0 4 4

xvda[0-9]+ should not be excluded by default only based on their name. At least in Linux, this is the easy but wrong way to check if a device is a partition or a disk. node_exporter could check if /sys/block/ exists (not partition) of if /sys/dev/block/:/partition file exists (not a disk).

If collector.diskstats.ignored-devices, by default, only excludes partitions, I think a new option like: "collector.diskstats.ignore-partitions (default: true)" should be used instead, leaving "ignored-devices" empty by default.

luizluca avatar May 17 '21 18:05 luizluca

As a suggestion, wouldn't it be better to iterate over "/sys/block/*/stat"? It is mostly the same format. It has the same order/meaning but it looks like it uses fixed-length fields (safer? faster?).

Would a PR changing from /proc/diskstats to /sys/block/*/stat be accepted?

luizluca avatar May 19 '21 03:05 luizluca

Parsing /sys/block/*/stat might be a good idea. But some users do want partition-level stats. We use the regexp method to provide the most generic way to filter exactly what the user wants to know without second guessing them.

SuperQ avatar May 19 '21 04:05 SuperQ

It's also possible to scan /sys/block/$DEV/$DEV*/stat for partition data. But either way, this would require completely re-designing the flag layout for this collector. It also changes the consistency of the timestamps that we get from parsing /proc/diskstats

This would be a breaking change for the existing behavior. It could be implemented as a completely new collector, this way existing users would continue to see the existing behavior. We could call it --collector.blockdev or something. It could become the default implementation for diskstats in 2.x.

In order to do this, we first need a /sys/block parser in https://github.com/prometheus/procfs.

SuperQ avatar May 19 '21 04:05 SuperQ

It's also possible to scan /sys/block/$DEV/$DEV*/stat for partition data. But either way, this would require completely re-designing the flag layout for this collector. It also changes the consistency of the timestamps that we get from parsing /proc/diskstats

The "timestamp consistency" is not guaranteed as /proc/diskstats does not use locks. It is expected, although not frequent, that partitions metrics sum does not match disk one. Reading individual files might increase this discrepancy but I think not to a unacceptable magnitude.

This would be a breaking change for the existing behavior. It could be implemented as a completely new collector, this way existing users would continue to see the existing behavior. We could call it --collector.blockdev or something. It could become the default implementation for diskstats in 2.x.

In order to do this, we first need a /sys/block parser in https://github.com/prometheus/procfs.

I created a PR proposing the change in two steps. The first one will not change the expected output (except for a minor delay between devices). The second one might break when someone was collecting partition data.

luizluca avatar May 19 '21 23:05 luizluca