telegraf icon indicating copy to clipboard operation
telegraf copied to clipboard

Expected udev properties cannot be used in diskio input

Open cornerot opened this issue 7 years ago • 8 comments

Bug report

Relevant telegraf.conf:

[[inputs.diskio]] device_tags = ["DEVTYPE", "ID_FS_TYPE"]

System info:

Telegraf v1.5.0 (git: release-1.5 a1668bbf) CentOS Linux release 7.4.1708 (Core)

Steps to reproduce:

  1. Add "device_tags = ["DEVTYPE", "ID_FS_TYPE"]" to the [[inputs.diskio]] section
  2. udevadm info -q property -n /dev/sda1 | egrep 'DEVTYPE|ID_FS_TYPE'
DEVTYPE=partition
ID_FS_TYPE=linux_raid_member
  1. telegraf --test | grep sda1 | sed 's|,|,\n|g;s| |\n|g'
2018/01/12 12:15:24 I! Using config file: /etc/telegraf/telegraf.conf
>
diskio,
name=sda1,
ID_FS_TYPE=linux_raid_member,
host=example.com
reads=1810738130i,
read_bytes=44825536246784i,
write_time=502432468i,
io_time=196321694i,
iops_in_progress=0i,
writes=158223504i,
write_bytes=1859747491328i,
read_time=1446620992i,
weighted_io_time=1948537468i
1515759325000000000

Expected behavior:

DEVTYPE and ID_FS_TYPE tags in the telegraf output

Actual behavior:

ID_FS_TYPE only

Additional info:

telegraf ignoring all device_tags in the [[inputs.diskio]] section which does not contain underscore: DEVNAME, DEVTYPE, SUBSYSTEM, etc.

cornerot avatar Jan 12 '18 12:01 cornerot

It looks like the suggest udevadm command shows more fields than we can actually get from the filesystem, here is how we get these fields:

Get the major/minor device numbers:

$ ls -l /dev/vda
brw-rw---- 1 root disk 254, 0 Jan 11 17:26 /dev/vda

Then we use this file as source of the device_tags:

$ cat /run/udev/data/b254\:0

I'm not sure if there is a good source of these properties outside of this method, other than running udevadm which would be too slow.

danielnelson avatar Jan 12 '18 20:01 danielnelson

Although this data is cached, so maybe we could use udevadm.

danielnelson avatar Jan 12 '18 20:01 danielnelson

I wouldn't recommend caching udevadm, as the device could get replaced between polling cycles, and you wouldn't know. It should be easy instead to just calculate the DEV* fields.

  • DEVTYPE comes from any of the s fields in the udev file (the value up to the first /).
  • DEVNAME comes from reading the symlink of any of the s fields.
  • DEVLINKS is the values of the s fields
  • DEVPATH i'm not sure about.

Not including these was actually deliberate when I wrote the device tagging stuff, as I didn't think they'd serve any use considering that DEVTYPE should always be disk as this is the "diskio" plugin, the DEVNAME is already added as the name tag, and neither DEVLINKS nor DEVPATH seemed like they'd be meaningful.

phemmer avatar Jan 13 '18 06:01 phemmer

Sure, DEVLINKS and DEVPATH not meaningful. But DEVTYPE can be not disk only, but also partition. DEVTYPE can be read from /sys/dev/block/[MAJOR:MINOR]/uevent:

cat /sys/dev/block/8:1/uevent
MAJOR=8
MINOR=1
DEVNAME=sda1
DEVTYPE=partition
PARTN=1

cornerot avatar Jan 15 '18 11:01 cornerot

We currently cache by devname (i.e.: sda1), so I'm sure we already have the caching bug. It seems like we could probably cache safely if we used the right property, maybe major + minor + usec_initialized?

I think it would be nice to use udevadm here, that way we don't have to have any asterisks in the docs explaining why some property doesn't exist.

Side note, the function we call on each interval in gopsutil already runs this udevadm command for each device found without caching to get the serial number.

danielnelson avatar Jan 17 '18 03:01 danielnelson

Pinging this one, as I have felt on this today with the missing DEVTYPE information from the diskio plugin. Worth noting it would also close #4238 (obviously, I also have the same need, and got tired of a long regex to remove partitions)

This aside, @danielnelson, udevadm just collects existing information. A property can exist on a given dev type, but could be missing on another dev type (for example, any dm-* usually lacks the ID_* properties) . So, either polling udevadm, or directly from the source as reported by @cornerot, located at /sys/dev/block/[major]:[minor]/uevent

Daryes avatar Oct 11 '18 21:10 Daryes

Hi,

Looking through older bugs to see if any action is still required.

Based on the discussion it does seem like adding DEVTYPE could be helpful as a tag to distinguish between disk and partition. Here is an example on my system:

$ grep DEVTYPE /sys/dev/block/*/uevent
/sys/dev/block/259:0/uevent:DEVTYPE=disk
/sys/dev/block/259:1/uevent:DEVTYPE=disk
/sys/dev/block/259:2/uevent:DEVTYPE=partition
/sys/dev/block/259:3/uevent:DEVTYPE=partition
/sys/dev/block/259:4/uevent:DEVTYPE=partition
/sys/dev/block/259:5/uevent:DEVTYPE=partition
/sys/dev/block/259:6/uevent:DEVTYPE=partition
/sys/dev/block/259:7/uevent:DEVTYPE=partition
/sys/dev/block/259:8/uevent:DEVTYPE=partition

powersj avatar Jun 07 '22 20:06 powersj

I just wanted to add that I've spent WAY too long on this one as well. My big problem was that my docker didn't have privledge=true in the compose. As soon as I added that, everything worked as expected.

eskemojoe007 avatar Aug 09 '23 02:08 eskemojoe007

@cornerot and @eskemojoe007 please test the binary in PR #15003 available once CI finished the tests. Let me know if this fixes the issue!

srebhan avatar Mar 15 '24 13:03 srebhan