Add Prometheus Metrics to API
This adds a set of Prometheus metrics endpoints to the API so that data can be scraped by Prometheus-compatible collectors rather than pushed.
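To give a sense of the shape of these endpoints, here is a minimal sketch of what one scrape target looks like in this style. The route path, table/column choices and response handling below are illustrative assumptions only, not the actual code in this PR:

```php
<?php
// Minimal sketch only: the route path, table/column names and response shape
// are illustrative assumptions, not the actual code in this PR.
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Route;

Route::get('/api/v0/prometheus/devices', function () {
    $lines = [
        '# HELP librenms_device_uptime_seconds Device uptime reported by the last poll',
        '# TYPE librenms_device_uptime_seconds gauge',
    ];

    foreach (DB::table('devices')->get(['hostname', 'uptime']) as $device) {
        // One sample per device, labelled with its hostname.
        $lines[] = sprintf(
            'librenms_device_uptime_seconds{hostname="%s"} %d',
            $device->hostname,
            $device->uptime
        );
    }

    // Prometheus expects the plain-text exposition format.
    return response(implode("\n", $lines) . "\n", 200)
        ->header('Content-Type', 'text/plain; version=0.0.4');
});
```

Prometheus then points a scrape job at the route and pulls values on its own schedule, instead of LibreNMS pushing them.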
Please note
Please read this information carefully. You can run ./lnms dev:check to check your code before submitting.
- [x] Have you followed our code guidelines?
- [x] If my Pull Request makes changes/fixes/enhancements in the WebUI, I have inserted a screenshot of it.
- [x] If my Pull Request makes discovery/polling/yaml changes, I have added/updated test data.
Testers
If you would like to test this pull request, please run: ./scripts/github-apply <pr_id>, i.e. ./scripts/github-apply 5926
After you are done testing, you can remove the changes with ./scripts/github-remove. If there are schema changes, you can ask on Discord how to revert.
In general I like it, but I think we might wanna go with something like https://github.com/spatie/laravel-prometheus
@murrant @laf thoughts?
You don't usually see filters in Prometheus metrics (or so many endpoints), but it looks pretty competent overall.
You are repeating yourself a lot in this code with a lot of boilerplate. You could make this a bit nicer by either using a package as Jellyfrog suggested or adding some more helpers in your trait (which should probably be in a Trait subdirectory).
Cheers @Jellyfrog and @murrant for the fast feedback. I've attempted to go the trait-helper route to reduce boilerplate. The package looks nice, but it's maybe not quite fit for purpose here since I'm trying to utilise the LibreNMS API and API security framework, so all I'd really be using from it would be addGauge/addCounter.
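For anyone curious what the helper route looks like, something along these lines; the names (PrometheusExposition, addGauge, promResponse) are just illustrative, not the exact code in the PR:

```php
<?php
// Hypothetical sketch of the kind of trait helper being discussed; the names
// below are illustrative, not the PR's actual code.
trait PrometheusExposition
{
    /** @var string[] collected exposition lines */
    protected $promLines = [];

    /**
     * Append one gauge metric: a HELP/TYPE header plus one sample per row.
     *
     * @param string   $name   metric name, e.g. librenms_sensor_value
     * @param string   $help   help text for the # HELP line
     * @param iterable $rows   result rows to export
     * @param callable $labels row => associative array of label values
     * @param callable $value  row => numeric sample value
     */
    protected function addGauge(string $name, string $help, iterable $rows, callable $labels, callable $value): void
    {
        $this->promLines[] = "# HELP {$name} {$help}";
        $this->promLines[] = "# TYPE {$name} gauge";

        foreach ($rows as $row) {
            $pairs = [];
            foreach ($labels($row) as $key => $val) {
                // Escape backslashes, quotes and newlines in label values.
                $pairs[] = $key . '="' . addcslashes((string) $val, "\\\"\n") . '"';
            }
            $this->promLines[] = $name . '{' . implode(',', $pairs) . '} ' . $value($row);
        }
    }

    /** Render everything collected so far as a text/plain exposition response. */
    protected function promResponse()
    {
        return response(implode("\n", $this->promLines) . "\n", 200)
            ->header('Content-Type', 'text/plain; version=0.0.4');
    }
}
```

With a helper like that, each endpoint collapses to one addGauge() call per metric followed by promResponse().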
Also, on filtering, I kind of agree. My goal is absolutely to scrape everything; I've only used ID filters for testing. I think the main benefit would be device groups: for people with very large deployments, I could see wanting to create a "prometheus" device group and just scraping that.
That said, I'm able to scrape the /ports endpoint with ~50000 ports in ~13sec and the /sensors endpoint with ~110000 sensors in ~14sec. So I've been pretty pleased with the performance so far in testing.
I'm mostly fine with this (just the filtering feels odd).
Time to write some documentation.
Is this intended to access all metrics and metric values, instead of just metadata about the LibreNMS application?
If you want to dump all metrics to Prometheus, you're going to need something different.
I'm currently using it for both, with much better results exporting metrics via this than with the Pushgateway method.
For all metrics, the architecture should be something like: store the current value in Redis during polling, then dump the values from Redis to Prometheus during the scrape. Querying the database like you are now won't scale.
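Roughly this shape, assuming Laravel's Redis facade and one hash per metric keyed by label set; the key layout and function names below are just illustrative, not an agreed design:

```php
<?php
// Rough sketch of the Redis-backed approach: the key layout and function
// names here are assumptions for illustration, not an agreed design.
use Illuminate\Support\Facades\Redis;

// During polling: write the freshly polled value into a Redis hash,
// one hash per metric, one field per time series (device/port/sensor).
function storeMetric(string $metric, string $seriesLabels, float $value): void
{
    Redis::hset("prometheus:{$metric}", $seriesLabels, $value);
}

// During scrape: dump the whole hash without touching the SQL database.
function scrapeMetric(string $metric): string
{
    $lines = ["# TYPE {$metric} gauge"];

    foreach (Redis::hgetall("prometheus:{$metric}") as $seriesLabels => $value) {
        // $seriesLabels already holds the label set, e.g. 'hostname="sw1",ifName="eth0"'.
        $lines[] = $metric . '{' . $seriesLabels . '} ' . $value;
    }

    return implode("\n", $lines) . "\n";
}
```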