node-exporter-textfile-collector-scripts
md_info_detail.sh: `CheckStatus` is not suitable as a label
Example:
```
node_md_info{md_device="md127", ... CheckStatus="75% complete", ...} 1
```
This value keeps changing while a check is running, so a single check can produce up to 100 different time series as it goes from 0% to 100%. Although it's nice to be able to see the progress, I think it's label abuse.
Here's the raw `mdadm --detail` output:
```
# mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Wed Mar 15 14:32:11 2017
Raid Level : raid10
Array Size : 39069470720 (37259.55 GiB 40007.14 GB)
Used Dev Size : 7813894144 (7451.91 GiB 8001.43 GB)
Raid Devices : 10
Total Devices : 11
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Dec 9 18:13:48 2019
State : clean, checking
Active Devices : 10
Working Devices : 11
Failed Devices : 0
Spare Devices : 1
Layout : near=2
Chunk Size : 4096K
Check Status : 75% complete
Name : 127
UUID : 00993e52:41fc1f0d:1f4457b0:a748d110
Events : 618588

Number Major Minor RaidDevice State
0 8 16 0 active sync set-A /dev/sdb
1 8 32 1 active sync set-B /dev/sdc
2 8 96 2 active sync set-A /dev/sdg
3 8 128 3 active sync set-B /dev/sdi
4 8 112 4 active sync set-A /dev/sdh
5 8 48 5 active sync set-B /dev/sdd
6 8 0 6 active sync set-A /dev/sda
7 8 144 7 active sync set-B /dev/sdj
8 8 160 8 active sync set-A /dev/sdk
9 8 64 9 active sync set-B /dev/sde
10 8 80 - spare /dev/sdf
```
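To illustrate the alternative, here is a minimal sketch of exposing the check progress as a sample value rather than as a label. The function name `md_check_progress` and the metric name `node_md_check_progress_percent` are hypothetical and are not part of the current script. The sketch assumes `mdadm --detail` output on stdin and an integer percentage, as in the output above.

```shell
# Hypothetical helper, not part of md_info_detail.sh.
# $1: md device name (e.g. md127); stdin: `mdadm --detail` output.
md_check_progress() {
  # Print only the integer percentage from the "Check Status" line, if any.
  sed -n 's/^[[:space:]]*Check Status[[:space:]]*:[[:space:]]*\([0-9][0-9]*\)%.*/\1/p' |
  while IFS= read -r pct; do
    # Illustrative metric name; the progress is the sample value, not a label.
    printf 'node_md_check_progress_percent{md_device="%s"} %s\n' "$1" "$pct"
  done
}
```

With this approach, `mdadm --detail /dev/md127 | md_check_progress md127` would emit `node_md_check_progress_percent{md_device="md127"} 75` for the output above, and nothing when no check is running. The label set stays constant, so the progress is recorded as one time series whose value changes.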
The same problem applies to "Rebuild Status", which is also exposed as a label:
```
# mdadm --detail /dev/md127
...
Consistency Policy : resync
Rebuild Status : 6% complete
Name : 127
UUID : c11df176:38c7bc39:48bb4c7f:47a8e782
Events : 13302
...
```
This results in the label `RebuildStatus="6% complete"`.
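One possible direction, sketched here under assumptions, is to skip the progress fields when the labels are built. The function `md_stable_labels` is hypothetical and does not reflect how the current script is structured. It assumes `key : value` lines on stdin and derives the label name by removing whitespace from the key, matching the `CheckStatus` and `RebuildStatus` names seen above. It does not escape quotes or backslashes in values.

```shell
# Hypothetical helper, not part of md_info_detail.sh.
# stdin: `mdadm --detail` output; stdout: one key="value" pair per line.
md_stable_labels() {
  awk '
    # Skip the volatile progress fields so they never become labels.
    /^[ \t]*(Check|Rebuild) Status[ \t]*:/ { next }
    {
      sep = index($0, " : ")        # first " : " splits key from value
      if (sep == 0) next            # not a key/value line (header, device rows)
      key = substr($0, 1, sep - 1)
      val = substr($0, sep + 3)
      gsub(/[ \t]/, "", key)        # "Raid Level" becomes "RaidLevel"
      printf "%s=\"%s\"\n", key, val
    }
  '
}
```

Fed the rebuild excerpt above, this would print `ConsistencyPolicy="resync"`, `Name="127"`, and the `UUID` and `Events` pairs, but no `RebuildStatus` line.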