graphios
graphios copied to clipboard
Allow metrics to be written via python plugin
Hey Shawn,
My name is Connor. I am an intern at iPlant Collaborative, we're an NSF funded organization specializing in bio-cyber-infrastructure to democratize access to U.S. supercomputing capabilities. (we help scientists leverage the cloud)
With my monitoring setup, I have a nagios plugin reporting metrics from a hypervisor. As a result the output concerns all the hypervisor's vm's. Instead of returning metrics for a single host, I'm returning stats for multiple hosts. Here's sample output from my plugin:
virt-stats OK - 2 running of 7 on <hypervisor> | <vm 1>=<cpu>;<mem>;<tx>;<rx>; <vm 2> ...
The metrics I would like to send to carbon:
stats.<hypervisor>.<vm 1>.cpu
...
stats.<hypervisor>.<vm 2>.cpu
...
Currently Graphios doesn't permit this flexibility, which is understandable. I need a dynamic graphite prefix.
I have implemented this behavior. The user simply has to write a python module defining get_metrics
. The user is responsible for ensuring that the returned path is valid for their given backend. Here's a sample plugin:
def get_metrics(perfdata, nag):
"""
returns a [(<path>, <value>)] where each is the metric to send to carbon
"""
path = "%s.%s.%s.%s" % (nag.GRAPHITEPREFIX, nag.HOSTNAME, nag.GRAPHITEPOSTFIX, nag.LABEL)
value = nag.VALUE
return [ (path, value) ]
The nagios service definition looks like so:
define service {
...
_graphiteplugin /path/to/plugin.py
...
}
So far it has been tested with carbon. If this behavior looks like something you want to be a part of graphios, let me know what else you'd like to see.
Hi there. First of all, I think it's pretty badass that your company is letting you work on open source, and second of all I think it's also pretty badass that you are actually doing it. :)
I don't mind this idea, but I think you might run into problems down the road with nagios depending on what you are using to receive data. nrpe and nsca have tiny payload sizes for perfdata by default; you can modify the source and increase to 4096 bytes, but I've run into problems with that as well. If you are using nrdp this shouldn't be a problem.
I think a better way to introduce this would be as a config file option; then the codepath would could be skipped unless it was purposefully enabled.
The loading of the modules will likely need some more exceptions added, specifically IOError and OSError in case the permissions are messed up for the module.
The code at line 359 -> 363 assumes the data returned is in the format you expect, this may cause problems if it isn't.
removing the clearing of whitespace you did on what was line 387 is likely detrimental for most people running this. There is also another merge I'm going to do (parserfix) that will be cleaning up any spaces in the perfdata (windows plugins and their spaces). Not sure what the right solution is here, you may consider something else to separate entries (space may work for you but won't for others).
Lastly grab the python module flake8 if you want travisci to stop complaining. :)
Have a good day.
-Shawn
Config is moved out of nagios definitions. (Do you want command line opts for this?)
I added better error handling reporting, 'missing path/missing method/see example'.
In terms of the white-space substitution, I changed it so that if you go the plugin route no substitutions are made. If you're writing a plugin you have complete flexibility (and complete freedom to shoot yourself in the foot).
This can easily be included with other backends. From a cursory look, librato
, influxdb
would use the same fix as carbon
(perhaps a case for base classing some of the backends). I don't have a convenient way to test those systems.
Should be flake8 compatible :).