munin
NetBSD irqstats plugin produces faulty output, causing storage blowout on server
Describe the bug The bundled plugins/node.d.netbsd/irqstats.in in munin-2.0.69 produces incorrect output that embeds the value of an entry in the name of the entry, so large numbers of unique entries appear. I found this while investigating why munin was (rapidly) consuming 8+ GB of storage for fewer than 100 nodes. The bug easily creates 10k+ files in a few days. The collected data is also useless, for the same reason.
The faulty output looks like this (short excerpt only):
intr_msix2_vec_0_____45173650.value 451736504
intr_msix2_vec_1______9604523.value 96045232
intr_msix2_vec_2_____10494045.value 104940458
intr_msix2_vec_3______6304586.value 63045862
The expected output would be this:
msix2_vec_0.value 451984907
msix2_vec_1.value 96049563
msix2_vec_2.value 105022424
msix2_vec_3.value 63064578
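One way to spot this class of bug is to compare the field names from two plugin runs: the values may change, the names must not. The following is a minimal sketch of that check, not part of munin. The helper names field_names and names_stable are hypothetical, the first sample is the faulty output quoted above, and the second sample is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not part of munin.

# Print the sorted field names found in plugin fetch output on stdin.
field_names() {
    sed -n 's/^\([^ ]*\)\.value .*/\1/p' | sort
}

# Succeed if two fetch outputs (passed as strings) use the same field names.
names_stable() {
    [ "$(printf '%s\n' "$1" | field_names)" = "$(printf '%s\n' "$2" | field_names)" ]
}

# First sample: faulty output quoted in this report.
faulty_1='intr_msix2_vec_0_____45173650.value 451736504
intr_msix2_vec_1______9604523.value 96045232'
# Second sample: made up for illustration, following the same pattern.
faulty_2='intr_msix2_vec_0_____45198490.value 451984907
intr_msix2_vec_1______9604956.value 96049563'

if names_stable "$faulty_1" "$faulty_2"; then
    echo "field names stable"
else
    echo "field names changed between runs"
fi
```

With these two samples the script reports that the field names changed between runs. On a real node the two strings could instead be captured from two invocations of the plugin a few seconds apart.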
To Reproduce Invoke the irqstats plugin on a sufficiently recent NetBSD machine (observed with at least NetBSD 9.0, but I'm fairly sure I saw it on earlier releases too; reproduced on i386, amd64 and sparc64) and look at the output:
intr_msix2_vec_0_____45173650.value 451736504
intr_msix2_vec_1______9604523.value 96045232
intr_msix2_vec_2_____10494045.value 104940458
intr_msix2_vec_3______6304586.value 63045862
Expected behavior Expected output would look like this:
msix2_vec_0.value 451984907
msix2_vec_1.value 96049563
msix2_vec_2.value 105022424
msix2_vec_3.value 63064578
Desktop
OS: NetBSD 9.1, NetBSD 9.2, NetBSD 9.3
Munin version: 2.0.69
Additional context
I've been running a rewritten version of the irqstats plugin for NetBSD for at least a few months now. It works and produces the expected output:
#! /bin/sh
#
# Plugin to monitor the individual interrupt sources.
#
# Usage: Link or copy into /etc/munin/node.d/
#
# $Log: irqstats.in,v $
# Revision 1.1.1.1 2006/06/04 20:53:57 he
# Import the client version of the Munin system monitoring/graphing
# tool -- project homepage is at http://munin.sourceforge.net/
#
# This package has added support for NetBSD, via a number of new plugin
# scripts where specific steps needs to be taken to collect information.
#
# I also modified the ntp_ plugin script to make it possible to not
# plot the NTP poll delay, leaving just jitter and offset, which IMO
# produces a more telling graph.
#
#
#
# Magic markers (optional - only used by munin-config and some
# installation scripts):
#
#%# family=auto
#%# capabilities=autoconf
if [ "$1" = "autoconf" ]; then
    if [ -x /usr/bin/vmstat ]; then
        echo yes
        exit 0
    else
        echo no
        exit 1
    fi
fi
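# Interrupt source names, with spaces replaced by underscores.
# Note [[:alnum:]]: a bare [:alnum:] would only match the literal
# characters ':', 'a', 'l', 'n', 'u' and 'm'.
intr_sources=$(/usr/bin/vmstat -i|grep -v Total|grep -v 'total rate'|sed -E 's/ {2,}/|/g'|sed 's/ /_/g'|grep -e '[[:alnum:]]'|cut -d\| -f1)
# Debug output; keep disabled, otherwise it ends up in the plugin output.
# echo "intr_sources = |$intr_sources|"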
# If run with the "config"-parameter, give out information on how the
# graphs should look.
if [ "$1" = "config" ]; then
    echo 'graph_title Individual interrupts'
    echo 'graph_args --base 1000 -l 0'
    echo 'graph_vlabel interrupts / ${graph_period}'
    echo 'graph_category system'
    echo -n 'graph_order '
    for i in $intr_sources; do
        echo -n ' '${i}
    done
    echo
    # Field names carry no intr_ prefix so that they match the
    # names printed with the values at the end of the script.
    for i in $intr_sources; do
#       echo ${i}'.draw LINE'
        echo ${i}'.label' `echo $i | sed -e 's/_/ /g'`
        echo ${i}'.info Interrupt' `echo $i | sed -e 's/_/ /g'`
        echo ${i}'.type DERIVE'
        echo ${i}'.min 0'
    done
    exit 0
fi
# Print one "<name>.value <total>" line per interrupt source.
/usr/bin/vmstat -i|grep -v Total|grep -v 'total rate'|sed -E 's/ {2,}/|/g'|sed 's/ /_/g'|sed 's/|/ /g'|grep -E '[[:alnum:]]'|awk '{print $1 ".value " $2}'
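The parsing step can be tried without a NetBSD machine by feeding it canned text. The following is a minimal sketch under stated assumptions: parse_intr is a hypothetical helper, the sample input only imitates the column layout of vmstat -i (the rate column is invented), and the numeric test on the second column is a swapped-in way of skipping the header line rather than the filter used in the script above.

```shell
#!/bin/sh
# Hypothetical helper: turn "vmstat -i"-style text on stdin into
# "<name>.value <total>" lines. Columns are assumed to be separated
# by runs of two or more spaces; single spaces belong to the name.
parse_intr() {
    grep -v Total |
    sed -E 's/ {2,}/|/g' |
    sed 's/ /_/g' |
    awk -F'|' '$1 != "" && $2 ~ /^[0-9]+$/ {print $1 ".value " $2}'
}

# Canned input; the totals are the expected values from this report,
# the rate column is invented.
parse_intr <<'EOF'
interrupt                                     total     rate
msix2 vec 0                               451984907       99
msix2 vec 1                                96049563       21
Total                                     548034470      120
EOF
```

For this input the helper prints msix2_vec_0.value 451984907 and msix2_vec_1.value 96049563, with no part of the value leaking into the name.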
If someone submits a pull request, eventually the downstream patch in NetBSD can be eliminated.