
Parsing of performance data

Open druchoo opened this issue 9 years ago • 7 comments

Given the format of perf data: 'label'=value[UOM];[warn];[crit];[min];[max] as described in the plugin guidelines... I have a few questions/issues.

  1. Labels with single quotes and/or spaces have varying results.
    • This one throws an error. Not sure if it's the spaces or the non alpha chars:

      • raw perf data:

      'Intel(R) PRO/1000 MT Network Connection-QoS Packet Scheduler-0000_in_prct'=0%;8000;9000;0;100 'Intel(R) PRO/1000 MT Network Connection-QoS Packet Scheduler-0000_out_prct'=0%;9000;10000;0;100 'Intel(R) PRO/1000 MT Network Connection-QoS Packet Scheduler-0000_speed_bps'=1000000000 'Intel(R) PRO/1000 MT Network Connection_in_prct'=0%;8000;9000;0;100 'Intel(R) PRO/1000 MT Network Connection_out_prct'=0%;9000;10000;0;100 'Intel(R) PRO/1000 MT Network Connection_speed_bps'=1000000000

      • graphios.log:

      March 05 22:46:38 graphios.py CRITICAL failed to parse label: 'MT' part of perfstring 'Intel(R) PRO_1000 MT Network Connection-QoS Packet Scheduler-0000_in_prct=0%;8000;9000;0;100 Intel(R) PRO_1000 MT Network Connection-QoS Packet Scheduler-0000_out_prct=0%;9000;10000;0;100 Intel(R) PRO_1000 MT Network Connection-QoS Packet Scheduler-0000_speed_bps=1000000000 Intel(R) PRO_1000 MT Network Connection_in_prct=0%;80

    • When labels have quotes ('somelabel'), the graphite metric keeps the quote.

      • raw perf data:

      'pid'=23466 heap=692439KB;;;;1048576 'heap ratio'=66%;80;90 perm=114665KB;;;;524288 perm_ratio=21%;80;90

      • whisper files. Note that 'pid' keeps its quotes, and 'heap ratio' was split on the space, leaving a file named with the trailing quote:
      # ls -l ../MemoryStatistics-Tomcat/
      total 1.5M
      -rw-r--r-- 1 carbon carbon 204K Mar  5 23:05 'pid'.wsp
      -rw-r--r-- 1 carbon carbon 204K Mar  5 23:05 heap.wsp
      -rw-r--r-- 1 carbon carbon 204K Mar  5 23:05 perm.wsp
      -rw-r--r-- 1 carbon carbon 204K Mar  5 23:05 perm_ratio.wsp
      -rw-r--r-- 1 carbon carbon 204K Mar  5 23:05 ratio'.wsp
      
  2. Does graphios just drop UOM, warn, crit, min, max?

The obvious workaround is to not use single quotes, spaces, or non-alpha chars in labels.

BTW, thank you for all the hard work!

druchoo avatar Mar 05 '15 23:03 druchoo

Looks like this is partly a dupe of #54. However, a space is a valid char in a label, in which case the label has to be single-quoted. This seems to be widely used in NSClient++ perf data.

2. label can contain any characters except the equals sign or single quote (')
3. the single quotes for the label are optional. Required if spaces are in the label
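
To make those two rules concrete, here is a minimal sketch of a parser that accepts both quoted and bare labels and keeps UOM, warn, crit, min, and max. It is not graphios's actual parser; the names `PERF_ITEM` and `parse_perfdata` are made up for illustration, and it assumes every value is numeric (so something like `label=U` would be skipped).

```python
import re

# One perfdata item: a single-quoted label (spaces allowed) or a bare
# label (no spaces), followed by =value[UOM] and up to four ;-separated fields.
# Hypothetical pattern for illustration only.
PERF_ITEM = re.compile(
    r"(?:'(?P<quoted>[^'=]+)'|(?P<bare>[^\s'=]+))"
    r"=(?P<value>[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)"
    r"(?P<uom>[^\s;]*)"
    r"(?P<rest>(?:;[^\s;]*)*)"
)


def parse_perfdata(perfdata):
    """Return a list of dicts, one per perfdata item found in the string."""
    items = []
    for match in PERF_ITEM.finditer(perfdata):
        label = match.group("quoted") or match.group("bare")
        # rest looks like ";warn;crit;min;max" with any field possibly empty
        fields = match.group("rest").split(";")[1:5]
        fields += [""] * (4 - len(fields))
        warn, crit, minimum, maximum = fields
        items.append({
            "label": label,
            "value": float(match.group("value")),
            "uom": match.group("uom"),
            "warn": warn,
            "crit": crit,
            "min": minimum,
            "max": maximum,
        })
    return items


if __name__ == "__main__":
    sample = "'pid'=23466 heap=692439KB;;;;1048576 'heap ratio'=66%;80;90"
    for item in parse_perfdata(sample):
        print(item)
```

With the sample above, the quotes are dropped from `pid` and `heap ratio` stays a single label instead of being split at the space.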

druchoo avatar Mar 08 '15 13:03 druchoo

I have tested the raw perfdata you gave here against this new code, can you give that a try and let me know if it works for you? (branch: parserfix)

shawn-sterling avatar Mar 09 '15 01:03 shawn-sterling

Still the same. Unfortunately it doesn't look like the quotes are written to the perfdata file.

$ grep -m 1 'Average CPU load' service-perfdata
DATATYPE::SERVICEPERFDATA       TIMET::1427049809       HOSTNAME::somehost.com        SERVICEDESC::Average CPU load   SERVICEPERFDATA::10 min avg Load=1%;80;95;0;100 60 min avg Load=0%;80;95;0;100 1440 min avg Load=1%;80;95;0;100 SERVICECHECKCOMMAND::check_nt_cpuload!10,80,95,60,80,95,1440,80,95      HOSTSTATE::UP   HOSTSTATETYPE::HARD     SERVICESTATE::OK        SERVICESTATETYPE::HARD  GRAPHITEPREFIX::$_SERVICEGRAPHITEPREFIX$        GRAPHITEPOSTFIX::$_SERVICEGRAPHITEPOSTFIX$

But they are there!

# check_nt -t 10 -H somehost.com -p 47956 -v CPULOAD -l 10,80,95,60,80,95,1440,80,95
CPU Load 2% (10 min average) 3% (60 min average) 2% (1440 min average) |   '10 min avg Load'=2%;80;95;0;100 '60 min avg Load'=3%;80;95;0;100 '1440 min avg Load'=2%;80;95;0;100

... Well, found the issue:

# ILLEGAL MACRO OUTPUT CHARACTERS
# This option allows you to specify illegal characters that are
# stripped from macros before being used in notifications, event
# handlers, etc.  This DOES NOT affect macros used in service or
# host check commands.
# The following macros are stripped of the characters you specify:
#       $HOSTOUTPUT$
#       $HOSTPERFDATA$
#       $HOSTACKAUTHOR$
#       $HOSTACKCOMMENT$
#       $SERVICEOUTPUT$
#       $SERVICEPERFDATA$
#       $SERVICEACKAUTHOR$
#       $SERVICEACKCOMMENT$

# illegal_macro_output_chars=`~$&|'"<>
illegal_macro_output_chars=`~$&'"<>

I'll have to test removing the single quote (') from illegal_macro_output_chars in QA. Not sure what that will break.
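
A small simulation of the effect, for anyone following along. This is only an illustration of what stripping those characters does to the perfdata string, not Nagios's own code; `strip_illegal_chars` is a hypothetical helper.

```python
def strip_illegal_chars(text, illegal_chars):
    """Remove every character listed in illegal_chars from text."""
    return "".join(ch for ch in text if ch not in illegal_chars)


# Plugin output as printed by check_nt, quotes intact.
plugin_perfdata = "'10 min avg Load'=2%;80;95;0;100 '60 min avg Load'=3%;80;95;0;100"

# Character set that includes the single quote, and one with it removed.
with_quote = "`~$&|'\"<>"
without_quote = with_quote.replace("'", "")

if __name__ == "__main__":
    # Quotes are gone, so the labels can no longer be told apart from spaces.
    print(strip_illegal_chars(plugin_perfdata, with_quote))
    # Quotes survive once ' is no longer in the illegal set.
    print(strip_illegal_chars(plugin_perfdata, without_quote))
```

The first result has the same shape as the SERVICEPERFDATA field in the perfdata file above: labels with spaces and no quotes left to delimit them.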

druchoo avatar Mar 22 '15 18:03 druchoo

I can confirm that using shlex.split() works in my environment for checks where perfdata is of the form 'label with space'=value, such as those from nsclient++. Please consider merging this.
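
A rough sketch of the idea, in case it helps review. This is not the exact patch; `split_perfdata` is a hypothetical name, and it assumes the single quotes actually reach the perfdata file.

```python
import shlex


def split_perfdata(perfdata):
    """Split a perfdata string into (label, rest) pairs.

    shlex.split() honours single quotes, so a quoted label with spaces
    stays in one token and the quotes themselves are removed.
    """
    pairs = []
    for token in shlex.split(perfdata):
        # Labels cannot contain '=', so the first '=' ends the label.
        label, _, rest = token.partition("=")
        pairs.append((label, rest))
    return pairs


if __name__ == "__main__":
    sample = ("'10 min avg Load'=2%;80;95;0;100 "
              "'60 min avg Load'=3%;80;95;0;100 heap=692439KB;;;;1048576")
    for label, rest in split_perfdata(sample):
        print(label, "->", rest)
```

Two caveats: if the quotes have already been stripped upstream, this falls back to splitting on every space, and `shlex.split()` raises `ValueError` on an unbalanced quote, so a caller would probably want to catch that and log the perfstring.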

acobaugh avatar Jan 13 '16 21:01 acobaugh

Sorry I haven't had time to test in my QA env and now I no longer have the ability to do so.

@phalenor can you provide your illegal_macro_output_chars?

druchoo avatar Jan 14 '16 20:01 druchoo

Here:

illegal_macro_output_chars=`~$&|'"<>

I wonder what the differences are between our installs. I'm running nagios core 4.1.1. I am definitely getting single quotes in (HOST|SERVICE)PERFDATA.

I wonder if this is relevant: http://tracker.nagios.org/view.php?id=13

acobaugh avatar Jan 14 '16 20:01 acobaugh

Looks to be exactly that. Unfortunately for me, I'm running 3.51 and don't plan on upgrading anytime soon. I think removing the single quote from illegal_macro_output_chars should fix this issue, but again I'm not sure of any side effects. I'll leave it up to @shawn-sterling to merge or not.

druchoo avatar Jan 14 '16 21:01 druchoo