graphite-collectors

New F5 metrics, cluster support and looping/service support

Open zonywhoop opened this issue 10 years ago • 5 comments

The changes here add the ability for the f5 collector to run as a service in a loop given a set time interval. This drastically reduces system load and makes the metric collection timing more consistent.
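
As a rough illustration of the looping mode described above (not the code from this pull request), a fixed-interval loop can schedule against absolute deadlines so that the time spent collecting does not accumulate as drift. The names `run_every`, `collect`, and `iterations` are hypothetical and exist only for this sketch.

```python
import time


def run_every(interval, collect, iterations=None,
              clock=time.monotonic, sleep=time.sleep):
    """Call collect() once per `interval` seconds.

    Hypothetical helper: deadlines are computed from the previous
    deadline rather than from "now", so a slow collection run does
    not push every later run back.
    """
    next_run = clock()
    count = 0
    while iterations is None or count < iterations:
        collect()
        count += 1
        next_run += interval
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        else:
            # The run took longer than the interval; resynchronise
            # instead of firing several catch-up runs back to back.
            next_run = clock()
```

Calling `run_every(10, collect)` would run until interrupted; `iterations`, `clock`, and `sleep` are only there to make the loop easy to exercise in a test.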

This also adds the ability to only collect certain metrics from the active node in a cluster vs collecting all metrics all the time.

Also added code to dig into a pool's members and send in metrics for those (e.g. current sessions/connections).

Additionally, the stat recursion logic has been moved into a function, and names are now "cleaned" (forward slashes and double .'s removed).
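
A minimal sketch of what such name cleaning could look like, for illustration only. The description says only that slashes and double dots are removed; this sketch assumes slashes become dot separators so that the partition and object name stay distinguishable, and that repeated dots collapse to one. `clean_metric_name` is a hypothetical name, not a function from the pull request.

```python
import re


def clean_metric_name(name):
    """Make an F5 object name usable as a Graphite metric path.

    Assumption for this sketch: "/" is turned into "." and any run
    of dots is collapsed, so no empty path components remain.
    """
    name = name.replace("/", ".")        # no forward slashes in the path
    name = re.sub(r"\.{2,}", ".", name)  # collapse "..", "...", etc.
    return name.strip(".")               # no leading or trailing dot
```

Under these assumptions, `clean_metric_name("/Common/pool..web")` yields `Common.pool.web`.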

zonywhoop avatar Jul 31 '15 17:07 zonywhoop

Hi, @zonywhoop. I've pushed a long overdue update of the F5 collector to the repository. Some of the enhancements have overlap with your pull request. Specifically, pool member statistics are now included (along with several other new categories of statistics).

I do like the idea of being able to run in a continuous polling mode. I would like to potentially find a way to incorporate this into the current code base.

Similarly, the ability to dynamically skip certain metrics or metric categories based upon which node is active in an HA cluster is pretty neat. Does this play nice with traffic groups? I would personally like to have command line options that indicate what gets skipped on the inactive member.

mhite avatar Aug 07 '15 22:08 mhite

Hey @mhite, I looked over the change you just committed, and there is quite a lot that's new. I can refactor my code on top of the new base, but it will take some time. One of the things I added was a function that loops over the dictionary it is handed, containing the STATISTICS_* indexes/values from the F5. You can also pass it a separate list of the values you want to capture, or null to capture all of them. The function then creates the graphite metric dictionary for that set of statistics and returns it. This really simplified and reduced the code for processing each set of metrics, for me at least. So if you are good with using that, I will work it back into your changes as well.
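
To make the idea concrete, here is a hedged sketch of a function of that shape. It is not the code from the pull request: the name `build_metrics`, the `prefix` argument, the lowercase naming, and the shape of the returned dictionary (metric path mapped to value) are all assumptions made for illustration.

```python
def build_metrics(prefix, stats, wanted=None):
    """Turn one set of F5 statistics into Graphite metrics.

    prefix -- metric path prefix for this object (assumed argument)
    stats  -- dict mapping statistic type names to numeric values
    wanted -- iterable of type names to keep, or None to keep all
    """
    metrics = {}
    for stat_type, value in stats.items():
        if wanted is not None and stat_type not in wanted:
            continue  # caller did not ask for this statistic
        # Assumed naming scheme: prefix + lowercased statistic type.
        metrics["%s.%s" % (prefix, stat_type.lower())] = value
    return metrics
```

With a helper like this, each category of statistics reduces to one call with a different prefix and filter list, which is where the reduction in per-category code would come from.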

On the HA cluster code, traffic groups are not currently taken into account. If the cluster node's state is "FAILOVER_STATE_ACTIVE", all metrics are collected as normal. If the value is anything else, metrics involving VS, TMM, etc. are skipped. I do not currently have any clustered F5s that I can try traffic groups on, but I could potentially spin some up on my machine. It would be interesting to see how that would work.
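
The gating described above can be sketched roughly as follows. The category names and the `categories_to_collect` function are hypothetical and chosen only for this example; the only value taken from the discussion is the "FAILOVER_STATE_ACTIVE" string.

```python
ACTIVE = "FAILOVER_STATE_ACTIVE"

# Hypothetical category names for metrics that only make sense on
# the active node (virtual servers, TMM, ...).
ACTIVE_ONLY = frozenset(["virtual_server", "tmm"])


def categories_to_collect(failover_state, categories,
                          active_only=ACTIVE_ONLY):
    """Return the metric categories to poll on this node."""
    if failover_state == ACTIVE:
        return list(categories)  # active node: collect everything
    # Any other state: skip the categories reserved for the active node.
    return [c for c in categories if c not in active_only]
```

Making `active_only` a parameter would also leave room for the command line options suggested earlier in the thread. Traffic groups are not modelled here, matching the limitation described above.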

zonywhoop avatar Aug 10 '15 16:08 zonywhoop

I totally agree -- the redundant collection stuff belongs in its own function. I'll take a closer look at what you've done there.

Also, I want to get the new carbon class tested better, merge it, and then start to tackle some of the other functionality we've been discussing. I'll start to open issues to address each one of these. There's a nasty bug in the remote timestamp right now that also needs to be fixed.

I plan on diving deeper this week -- will update you soon.

Do you mind kicking the tires on the plaintext branch to test both plaintext and pickle delivery?
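
For reference while testing, this is a small sketch of how the two Carbon wire formats differ. It only builds the bytes and does not reflect how the branch implements its carbon class; the function names are hypothetical. The plaintext format is one `path value timestamp` line per metric, and the pickle format is a pickled list of `(path, (timestamp, value))` tuples preceded by a 4-byte big-endian length header.

```python
import pickle
import struct


def encode_plaintext(metrics):
    """Encode [(path, (timestamp, value)), ...] as plaintext lines."""
    lines = ["%s %s %d\n" % (path, value, timestamp)
             for path, (timestamp, value) in metrics]
    return "".join(lines).encode("ascii")


def encode_pickle(metrics):
    """Encode the same list for Carbon's pickle receiver."""
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte length prefix
    return header + payload
```

Either result can be written to a TCP socket connected to the matching Carbon listener (by default, port 2003 for plaintext and 2004 for pickle).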

mhite avatar Aug 15 '15 19:08 mhite

I merged the 'plaintext' branch into master. Also created new branch called 'timestamp' that should fix the remote timestamp issue.

Once I've got this all tested, can dig further into the items you've brought up!

mhite avatar Aug 18 '15 04:08 mhite

Hey Matt, I've been working on another project, but I’ll give the plaintext branch a whirl and see how it goes. I'm hoping to have some time in the next week or so to merge the stat collection function back in too. Let me know if I can help with anything else.

Ed McLain



zonywhoop avatar Aug 18 '15 20:08 zonywhoop