munin
diskstats plugin only accepts storage device kernel names
The diskstats plugin can be configured to use only a subset of all available disks in the machine. However, on my machine not all disks are named statically: after each reboot the names (/dev/sdX) might be permuted, as the kernel assigns names according to the driver loading order. One cannot change the name using udev (this is only possible for network devices), so I added symlinks for individual hard disks.
With the plugins I use, I did not run into any trouble except with the diskstats plugin. Its configuration needs the kernel names (sdX) because it reads directly from /proc or /sys. Thus the data of different disks might get mixed up when the machine is rebooted (and the devices are renamed).
An alternative would be to use other unique identification methods (e.g. UUID) to distinguish the disks.
Munin version: 2.0.26-4 OS: Archlinux
Probably the following could be a reasonable approach:
- for relative device names (without a slash): keep the current behaviour
- for absolute device paths:
  - find the major / minor number of a given device (e.g. printf "%d %d" $(stat -L -c "0x%t 0x%T" DEVICE_PATH))
  - extract the matching line in /proc/diskstats based on major / minor (first and second field)
  - use the third field of this /proc/diskstats line (the "real" device name) for the final lookup below /sys/block/
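The steps above can be sketched in shell before anyone ports them to Perl. This is only an illustration, not plugin code: the function names `device_majmin` and `kernel_name_for` are made up for this example, and `device_majmin` assumes GNU stat (whose `%t`/`%T` formats print the numbers in hex).

```shell
#!/bin/sh
# Hypothetical helpers, not part of the diskstats plugin.

# Print "MAJOR MINOR" in decimal for a device node, following symlinks.
# Assumes GNU stat; %t and %T print the major/minor numbers in hex.
device_majmin() {
    # Word splitting of the stat output is intended here.
    printf '%d %d\n' $(stat -L -c '0x%t 0x%T' "$1")
}

# Print the kernel name (third field) of the diskstats line whose first
# two fields match MAJOR and MINOR. Exits non-zero if nothing matches.
# The third argument defaults to /proc/diskstats.
kernel_name_for() {
    awk -v maj="$1" -v min="$2" '
        $1 == maj && $2 == min { print $3; found = 1; exit }
        END { exit !found }
    ' "${3:-/proc/diskstats}"
}
```

Something like `kernel_name_for $(device_majmin /dev/disk/by-uuid/SOME-UUID)` would then print the kernel name (for example `sda1`) that the plugin could use for its existing lookup.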
Maybe someone would like to turn this into perl code?
This should work sufficiently. Unfortunately I am not a perl programmer. I would have to work myself into the language first... So if there is anyone out there that can help to code it, this would be really nice.
I'm not fluent in Perl myself, but have you considered using diskstat_ instead? It's not as neat, but might be a better fit for your scenario. You can also combine them into a single graph afterwards.
The problem remains the same: both plugins seem to look only at procfs/sysfs, which contain only the kernel name and no additional (persistent) name. At least I was not able to get it running with diskstat_.
The UUID in the label would easily scare the devil himself and carries no meaning unless you also have a mount map or something. Of course a mount map or alternative names can be supplied in .info, such as in this example: http://guide.munin-monitoring.org/en/latest/reference/plugin.html#configuration-run
I think the UUID should not be that problematic. At least on my installation /dev/disk/by-uuid contains links with the name of the UUID that point to a canonical file under /dev like sda1 or similar. So I think it would be sufficient to
- allow names containing directory names to be encoded somehow
- read the corresponding link recursively to obtain a canonical name as before
- use the previous code to extract the information from /sys or /proc.
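As a rough illustration of the "read the link recursively" step, the following shell sketch resolves a persistent path to its kernel name. The function name `canonical_devname` is hypothetical, and the sketch assumes a `readlink` that supports `-f` (as in GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper, not part of the diskstats plugin.

# Resolve a (possibly nested) symlink such as /dev/disk/by-uuid/<uuid>
# and print only the final file name, e.g. "sda1".
canonical_devname() {
    resolved=$(readlink -f "$1") || return 1
    basename "$resolved"
}
```

A path without symlinks, such as `/dev/sda`, passes through unchanged apart from the directory being stripped, so relative kernel names would keep working as before.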
Or use the suggested solution from @sumpfralle.
The problem I have with UUIDs is that they're not unique in systems that bundle together disks. This includes md raids, LVM PVs, and btrfs arrays.
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
sda linux_raid_member 8106d0c5-251f-de74-5595-54843d645bf6
└─md127 LVM2_member WTxyj6-QJ0E-Ad7B-C5Be-zWk8-ZjNa-pxLP1h
└─storage1-storage1 ext4 e04fcbe2-b39e-4021-908c-ebbbd537add2 218.9G 86% /
sdb linux_raid_member 8106d0c5-251f-de74-5595-54843d645bf6
└─md127 LVM2_member WTxyj6-QJ0E-Ad7B-C5Be-zWk8-ZjNa-pxLP1h
└─storage1-storage1 ext4 e04fcbe2-b39e-4021-908c-ebbbd537add2 218.9G 86% /
sdc linux_raid_member 8106d0c5-251f-de74-5595-54843d645bf6
└─md127 LVM2_member WTxyj6-QJ0E-Ad7B-C5Be-zWk8-ZjNa-pxLP1h
└─storage1-storage1 ext4 e04fcbe2-b39e-4021-908c-ebbbd537add2 218.9G 86% /
sdd linux_raid_member 8106d0c5-251f-de74-5595-54843d645bf6
└─md127 LVM2_member WTxyj6-QJ0E-Ad7B-C5Be-zWk8-ZjNa-pxLP1h
└─storage1-storage1 ext4 e04fcbe2-b39e-4021-908c-ebbbd537add2 218.9G 86% /
sde btrfs usbdrive1 a6864940-727c-4740-a20d-1f37a202006b 3T 60% /usbdrive1
sdf btrfs usbdrive1 a6864940-727c-4740-a20d-1f37a202006b
sdg linux_raid_member 8106d0c5-251f-de74-5595-54843d645bf6
└─md127 LVM2_member WTxyj6-QJ0E-Ad7B-C5Be-zWk8-ZjNa-pxLP1h
└─storage1-storage1 ext4 e04fcbe2-b39e-4021-908c-ebbbd537add2 218.9G 86% /
sdh
└─sdh1 ext2 drivespaceboot f7ce45e5-0dc5-442b-bfec-fb530654374a
# ls -l /dev/disk/by-uuid/
total 0
lrwxrwxrwx 1 root root 9 Jan 5 03:31 a6864940-727c-4740-a20d-1f37a202006b -> ../../sdf
lrwxrwxrwx 1 root root 10 Dec 17 22:36 e04fcbe2-b39e-4021-908c-ebbbd537add2 -> ../../dm-0
lrwxrwxrwx 1 root root 10 Jan 5 03:32 f7ce45e5-0dc5-442b-bfec-fb530654374a -> ../../sdh1
As you can see, I have a lot more disks than UUIDs.
I went digging through the diskstats plugin code and I noticed that there is already code in there to rename LVM volumes so that the name is displayed instead of the dm-0, dm-1, or whatever device. I thought this sounds a lot like what I want. I want my physical devices enumerated, but then named/stored in the database using stable names so it doesn't matter how my drives rearrange themselves as sda, sdb, and so on.
I searched around for an identifier I could use, and WWN seemed to be the only choice. UUID is repeated across all drives in my arrays. Other IDs change depending on physical connection. WWN is tied to the disk drive itself, so that's what I went with.
# ls -l /dev/disk/by-id/ | grep wwn
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x5000c50086f7c8bb -> ../../sde
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x5000c50087307d57 -> ../../sdf
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x50014ee202a8cc1b -> ../../sdd
lrwxrwxrwx 1 root root 9 Jan 5 03:32 wwn-0x50014ee257fe401a -> ../../sdg
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x50014ee2ad53b089 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x50014ee2ad53b265 -> ../../sda
lrwxrwxrwx 1 root root 9 Jan 5 03:31 wwn-0x50014ee2ad53b33b -> ../../sdc
I should make it very clear I'm not a programmer. However, I hacked together a simple function that accepts a device name and returns the WWN. Here it is being handed all devices one at a time, and the function is returning WWNs when it can:
# ./diskwwn.pl | grep sd
for sda WWN name is wwn-0x50014ee2ad53b265
for sdb WWN name is wwn-0x50014ee2ad53b089
for sdc WWN name is wwn-0x50014ee2ad53b33b
for sdd WWN name is wwn-0x50014ee202a8cc1b
for sde WWN name is wwn-0x5000c50086f7c8bb
for sdf WWN name is wwn-0x5000c50087307d57
for sdg WWN name is wwn-0x50014ee257fe401a
for sdh WWN name is sdh
I then plugged that into the diskstats code just below the LVM "pretty" renaming portion so that my code could have a chance to rename too. Here's the result:

The "wwn" drives you see are sda through sdg, but renamed with stable names that should stick to that physical drive no matter how it is physically connected and no matter what sd* device name it gets. The one item listed as "sdh" is a USB thumb drive that does not have WWN support so it falls back to the sd* name. The "storage1" item is an LVM volume renamed to a "pretty name" by the existing code.
--- diskstats 2019-01-04 20:11:19.334747113 -0600
+++ diskstats2 2019-01-12 00:09:06.597150981 -0600
@@ -480,6 +480,8 @@
if ( $device =~ /^dm-\d+$/ ) {
$pretty_device = translate_devicemapper_name($device);
+ } elsif ( $device =~ /^sd[a-z]+$/ ) {
+ $pretty_device = translate_dev_to_wwn($device);
}
$pretty_device ||= $device;
@@ -580,6 +582,28 @@
return %diskstats;
}
+# translate_dev_to_wwn
+#
+# Tries to find a WWN based on a device, such as "sda"
+# Returns either a resolved WWN or the original devicename
+
+sub translate_dev_to_wwn {
+ my ($device) = @_;
+
+ my $rdev = ( stat("/dev/" . $device) )[6];
+
+ my @wwndevices = glob "/dev/disk/by-id/wwn-*";
+
+ for my $entry (@wwndevices) {
+ if($rdev == ( stat($entry) )[6]) {
+ return basename($entry);
+ }
+ }
+
+ # Return original string if the wwn can't be found.
+ return $device;
+}
+
# translate_devicemapper_name
#
# Tries to find a devicemapper name based on a minor number
Someone who is a real programmer and knows what they're doing, feel free to take this and run with it. There very well may be egregious errors in the code above, but I think it gives us something to work with and at the very least serves as a proof of concept.
One thing that may be necessary is to add functionality so this renaming is optional instead of applied to all sd* devices that have a WWN present.
Another idea would be to put an alias using udev. The problem that arises here is (independent of diskstats vs diskstat_) that the plug-in did not handle the symlink to the device name well. Maybe the code of @mattbuford can help here.
@christianlupus Great idea to use udev. I have gone with this direction. It was actually a trivial change to the patch I had written, really just looking in another directory for the alias symlinks instead of the wwn symlinks. The real work was over in the udev rules.
I like this because:
- it frees users to key in on anything they want when aliasing their disks. It can be physical drive bay, or drive serial number, or whatever you want. udev is super flexible.
- You get to use a nice friendly name in your alias of whatever you want instead of the annoyingly long WWNs.
- It doesn't change the drive name at all unless you create an alias in the specific directory, so users who don't want this won't suddenly see all their disks monitored via WWN instead. Users see the usual sda sdb sdc unless they add udev rules.
- It works even on drives that don't support WWN (such as my USB thumb drive).
I chose to key in on serial numbers for my udev rules, but you can do all kinds of crazy things here to name devices based on whatever you want. Here is an example /etc/udev/rules.d/70-disk-alias.rules file:
SUBSYSTEM!="block", GOTO="end"
KERNEL=="sd*", ENV{ID_SERIAL}=="SanDisk_U3_Cruzer_Micro_45269107A4E0FAF2-0:0", SYMLINK+="disk/by-alias/usbboot"
KERNEL=="sd*", ENV{ID_SERIAL}=="WDC_WD10EADS-00L5B1_WD-WCAU49167695", SYMLINK+="disk/by-alias/array1disk1"
KERNEL=="sd*", ENV{ID_SERIAL}=="WDC_WD10EADS-00L5B1_WD-WCAU49026339", SYMLINK+="disk/by-alias/array1disk2"
KERNEL=="sd*", ENV{ID_SERIAL}=="WDC_WD10EADS-00L5B1_WD-WCAU49023344", SYMLINK+="disk/by-alias/array1disk3"
KERNEL=="sd*", ENV{ID_SERIAL}=="WDC_WD10EADS-00L5B1_WD-WCAU48984610", SYMLINK+="disk/by-alias/array1disk4"
KERNEL=="sd*", ENV{ID_SERIAL}=="WDC_WD10EADS-00L5B1_WD-WCAU49168573", SYMLINK+="disk/by-alias/array1disk5"
KERNEL=="sd*", ENV{ID_SERIAL}=="ST8000AS0002-1NA17Z_Z84099NA", SYMLINK+="disk/by-alias/array2disk1"
KERNEL=="sd*", ENV{ID_SERIAL}=="ST8000AS0002-1NA17Z_Z8409TK2", SYMLINK+="disk/by-alias/array2disk2"
LABEL="end"
The serial numbers above can be found like this:
# udevadm info --query=property --path=/sys/block/sda | grep "^ID_SERIAL="
ID_SERIAL=WDC_WD10EADS-00L5B1_WD-WCAU49167695
# udevadm info --query=property --path=/sys/block/sdh | grep "^ID_SERIAL="
ID_SERIAL=SanDisk_U3_Cruzer_Micro_45269107A4E0FAF2-0:0
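Writing these rule lines by hand is error-prone, so a small shell helper may be convenient. This is a sketch only: the function names `serial_of` and `alias_rule` are invented for this example, and the rule format simply mirrors the lines in the rules file above.

```shell
#!/bin/sh
# Hypothetical helpers for generating by-alias rules.

# Print the ID_SERIAL property of a kernel device name such as "sda",
# using the same udevadm query as shown above.
serial_of() {
    udevadm info --query=property --path="/sys/block/$1" |
        sed -n 's/^ID_SERIAL=//p'
}

# Print one udev rule line mapping a serial number to an alias symlink
# under /dev/disk/by-alias/.
alias_rule() {
    printf 'KERNEL=="sd*", ENV{ID_SERIAL}=="%s", SYMLINK+="disk/by-alias/%s"\n' \
        "$1" "$2"
}
```

For example, `alias_rule "$(serial_of sda)" array1disk1` prints a line that can be pasted into the rules file. After editing the file, running `udevadm control --reload` followed by `udevadm trigger --subsystem-match=block` as root should normally apply the rules without a reboot.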
The resulting alias symlinks:
# ls -l /dev/disk/by-alias/
total 0
lrwxrwxrwx 1 root root 9 Jan 14 20:44 array1disk1 -> ../../sda
lrwxrwxrwx 1 root root 9 Jan 14 20:45 array1disk2 -> ../../sdb
lrwxrwxrwx 1 root root 9 Jan 14 20:45 array1disk3 -> ../../sdc
lrwxrwxrwx 1 root root 9 Jan 14 20:45 array1disk4 -> ../../sdd
lrwxrwxrwx 1 root root 9 Jan 14 20:46 array1disk5 -> ../../sdg
lrwxrwxrwx 1 root root 9 Jan 14 20:45 array2disk1 -> ../../sde
lrwxrwxrwx 1 root root 9 Jan 14 20:45 array2disk2 -> ../../sdf
lrwxrwxrwx 1 root root 9 Jan 14 20:46 usbboot -> ../../sdh
Updated patch:
--- diskstats 2019-01-04 20:11:19.334747113 -0600
+++ diskstats3 2019-01-14 20:56:27.616224925 -0600
@@ -480,6 +480,8 @@
if ( $device =~ /^dm-\d+$/ ) {
$pretty_device = translate_devicemapper_name($device);
+ } elsif ( $device =~ /^sd[a-z]+$/ ) {
+ $pretty_device = translate_dev_to_alias($device);
}
$pretty_device ||= $device;
@@ -580,6 +582,28 @@
return %diskstats;
}
+# translate_dev_to_alias
+#
+# Tries to find an alias based on a device, such as "sda"
+# Returns either a resolved alias or the original devicename
+
+sub translate_dev_to_alias {
+ my ($device) = @_;
+
+ my $rdev = ( stat("/dev/" . $device) )[6];
+
+ my @aliasdevices = glob "/dev/disk/by-alias/*";
+
+ for my $entry (@aliasdevices) {
+ if($rdev == ( stat($entry) )[6]) {
+ return basename($entry);
+ }
+ }
+
+ # Return original string if the alias can't be found.
+ return $device;
+}
+
# translate_devicemapper_name
#
# Tries to find a devicemapper name based on a minor number
Example output:

In this graph, storage1/storage1 is an LVM volume, and everything else is a physical drive getting a nice name from the patch's aliasing.
@mattbuford Awesome work!
This is exactly how I thought it should work. To be honest, I do not (yet) have the /dev/disk/by-alias folder, but I assume that udev will create it when the need arises. I will have to try it out on my installation, but I think this should sooner or later be put into the current branch as a pull request ;-).
Just a note: I think the names in /dev/disk/by-id should become the default for the diskstats plugins.
Agree, this plugin should at least accept /dev/disk/by-id/* devices to provide consistent and meaningful output on machines with more drives.
@mattbuford would you be willing to create a PR with your patch? For me it seems to work in a short test.
Changing the default to parse /dev/disk/by-id/ might cause inconsistencies with previous installations. Also, this can be done independently in a separate patch (maybe with some configuration to keep the old behavior). Nevertheless, this patch should not break anything and thus can be included without much worry, as far as I understand its working principle.
Sure, I'll give it a try. I think the best way to do this is through an ENV variable added to the plugin's config file that enables this feature. That way the behavior, and especially the data store name, doesn't change unexpectedly on existing installations. I believe that this can be done as a global variable, so that it only has to be enabled in the config once, and then that same config section can be referenced to enable support as I add this friendly drive name support to other disk monitoring plugins (assuming I can figure it out and get around to it).
If the ENV variable is set, it will check for an alias and use that as the first priority. If that doesn't exist, it will fall back to WWN. If that doesn't exist, it will use the regular sda style name.
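The priority order described here (alias, then WWN, then kernel name) can be sketched in shell as follows. This is not the planned Perl code: the function names are hypothetical, symlink targets are compared via `readlink -f` rather than device numbers, the ENV switch is left out, and the optional second argument of `stable_name` exists only so the sketch can be tried against a scratch directory instead of /dev.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the alias -> WWN -> kernel name order.

# find_link_for TARGET DIR [PREFIX]
# Print the name of the first symlink in DIR (optionally restricted to
# names starting with PREFIX) that resolves to the same file as TARGET.
find_link_for() {
    target=$(readlink -f "$1") || return 1
    for link in "$2"/"${3:-}"*; do
        [ -L "$link" ] || continue          # also skips an unmatched glob
        if [ "$(readlink -f "$link")" = "$target" ]; then
            basename "$link"
            return 0
        fi
    done
    return 1
}

# stable_name KERNEL_NAME [DEV_ROOT]
# Prefer a by-alias symlink, then a by-id/wwn-* symlink, and finally
# fall back to the kernel name itself.
stable_name() {
    dev_root=${2:-/dev}
    find_link_for "$dev_root/$1" "$dev_root/disk/by-alias" ||
        find_link_for "$dev_root/$1" "$dev_root/disk/by-id" wwn- ||
        printf '%s\n' "$1"
}
```

Calling `stable_name sda` would print the alias if one exists, otherwise the `wwn-...` name, otherwise `sda`.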
I did some playing around with WWN, and from my limited home sample set this seems to work reliably for SATA drives. For USB enclosures/docks, it is very hit-or-miss. Some enclosures don't provide WWN at all. Some multi-drive enclosures provide the same WWN (and manufacturer and serial number) for every drive in the enclosure, resulting in only one drive getting a WWN symlink and the rest just not having any. So, the results of using WWN with USB enclosures might not be perfect, but it won't be any worse than randomly moving sda sdb sdc names.
Example disk-to-name test on my NAS, preferring a manually created alias, then the automatic WWN, then the sd? name:
sda array1disk4
sdb array1disk2
sdc array2disk1
sdd array2disk2
sde array1disk1
sdf array1disk3
sdq usbboot
sdj array2disk3
sdk wwn-0x50014ee1ad4a27f0
sdm sdm
sdl sdl
sdn sdn
sdo wwn-0x50014ee257fe401a
sdp wwn-0x50014ee257c457c8
sdh sdh
sds wwn-0x50014ee2ad53b265
sdr wwn-0x50014ee1ac79138c
sdt wwn-0x5000cca224c2a1c9
sdu sdu
sdw sdw
I'll work on figuring out the ENV variable enabling and create a PR once this seems to be working.