
WIP: Add more testing + memory stats from cgroups

Open hamiltont opened this issue 5 years ago • 18 comments

PR adds:

  • Reports unit memory metrics by reading memory.stat from cgroupfs (see the sketch after this list)
  • Pulls cgroup code into distinct package to prep for upstreaming
  • Improves TravisCI setup - example
    • Adds end-to-end integration test for two live systemd versions
    • Adds unit tests
  • Increases scrape speed by about 15% by reducing dbus calls
  • Uploads code coverage - example
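
For anyone skimming, here is a minimal sketch of the kind of cgroupfs read involved. This is not the PR's actual code; the cgroup path layout and the plain map return type are assumptions for illustration only:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMemoryStat reads "name value" pairs (for example "rss 4096") from a
// cgroup memory.stat file and returns them as a map keyed by stat name.
func parseMemoryStat(path string) (map[string]uint64, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	stats := make(map[string]uint64)
	s := bufio.NewScanner(f)
	for s.Scan() {
		fields := strings.Fields(s.Text())
		if len(fields) != 2 {
			continue // skip anything that is not a simple "key value" line
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		stats[fields[0]] = v
	}
	return stats, s.Err()
}

func main() {
	// Hypothetical cgroup v1 path for a single service unit.
	stats, err := parseMemoryStat("/sys/fs/cgroup/memory/system.slice/sshd.service/memory.stat")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("rss bytes:", stats["rss"])
}
```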

Remaining TODOs

  • Decide which memory stats we should export; there are a lot. @povilasv, any thoughts you have would be appreciated here. It seems silly to start exporting 20 new metrics per unit, so we should decide which memory metrics are most impactful
  • Clean up docs
  • Clean up the cgroup/cgroup.go file

fixes #2

hamiltont avatar Feb 03 '20 06:02 hamiltont

Hey :wave: let me know when it's ready for review :)

povilasv avatar Feb 06 '20 07:02 povilasv

@povilasv Sure, anytime now should be OK. This PR is big already (sorry, seems to be my bad habit), so let's aim to clean it up as needed and get it merged before the next push ;-) I could use any insight you have on "what memory values matter enough to export as metrics"

hamiltont avatar Feb 06 '20 07:02 hamiltont

Also, a second concern: what should the exported metrics be named? node_exporter moved to the pattern node_memory_Capname_metric (see the screenshot below from one of my Prometheus consoles). I could not find a good discussion of why they changed their naming pattern, or why they export so many metric names instead of just something like node_memory{stat="anon"}. Regardless, this is a good time to consider how systemd_exporter wants time series for memory metrics to be organized

[Screenshot from a Prometheus console showing node_exporter's node_memory_* metric names, taken Feb 6 2020]

hamiltont avatar Feb 06 '20 07:02 hamiltont

Will review this week, thanks for this :)

Re naming: I am not sure why node_exporter does this, but I think we should follow https://prometheus.io/docs/practices/naming/ and do whatever makes sense for our use case and query patterns.

I feel like using stat="anon" would make a single metric very large and suffer from a huge cardinality explosion, so maybe a metric per stat is better.

@SuperQ Would be great to get some insights from you regarding naming :)

povilasv avatar Feb 10 '20 14:02 povilasv

Labels only make sense when there is an aggregation that can be applied. Using a label for node_memory_... doesn't make sense because the various values don't add up or average together. For example: sum(node_memory_bytes) doesn't make any sense since MemTotal + MemAvailable bytes is nonsensical.

Having separate metrics also helps for more typical use cases like this:

node_memory_MemAvailable_bytes /
node_memory_MemTotal_bytes

If it were a label, you would have to do something crazy like this:

sum without (stat) (node_memory_bytes{stat="MemAvailable"}) /
sum without (stat) (node_memory_bytes{stat="MemTotal"})

SuperQ avatar Feb 10 '20 15:02 SuperQ

That's very helpful, thanks. We should export distinct metrics then. @povilasv, to answer the second question (what should we export): maybe we just expose the same metrics that node_exporter does, but per unit being monitored. It would be more than enough to get anyone started. Thoughts?

Side point: we may want to document that users can run more than one systemd_exporter instance, monitoring different units at different levels, to deal with cardinality issues. I've got an idea on this for the future: we could allow folks to filter by the slice or scope their units are in (rough sketch below).
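
To make that idea concrete, here is a rough sketch of what slice-based filtering could look like. The regexp allow list and the unit struct are purely hypothetical, not anything the exporter supports today:

```go
package main

import (
	"fmt"
	"regexp"
)

// unit is a stand-in for whatever per-unit data the collector already holds;
// only the name and the parent slice matter for this sketch.
type unit struct {
	Name  string
	Slice string
}

func main() {
	// Hypothetical scope for this exporter instance: only units in system.slice.
	// A second instance with a different pattern could cover the user slices.
	allowSlice := regexp.MustCompile(`^system\.slice$`)

	units := []unit{
		{Name: "sshd.service", Slice: "system.slice"},
		{Name: "app.service", Slice: "user-1000.slice"},
	}

	for _, u := range units {
		if !allowSlice.MatchString(u.Slice) {
			continue // leave this unit to a differently scoped instance
		}
		fmt.Println("collecting", u.Name)
	}
}
```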

hamiltont avatar Feb 10 '20 16:02 hamiltont

My opinion on metric selection: I do not see harm in essentially exporting all of them. If they are constant, they will be well compressed by Prometheus and consume almost no resources.

The more interesting question is which metrics to keep as separate metrics/gauges, and which ones to group together under labels. I think it is useful to group some metrics into a single gauge with labels if it makes sense to 1) visualise all of them at the same time, or 2) sum them to get a total of something.

From a quick look, this only applies to a few metrics.

  1. rss and rss_huge could be put together, unless rss from cgroups already accounts for huge allocations.

  2. Faults: it might make sense to use a single counter with minor and major label values. But I expect major to be what most people look at, with minor providing almost no information and often being significantly bigger, which would drown out the actual signal. So it might still be beneficial to keep them separate (a sketch of both options follows below).

As for the other metrics, I think they should all be separate (because aggregating over them doesn't make much sense), but I didn't look too deeply into this, and there might be some exceptions.
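
To illustrate the faults question with client_golang, here is a hedged sketch of both options side by side. The metric names, label names, and values are hypothetical, not what this PR exports:

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// faultsCollector sketches the two options discussed above for page faults.
type faultsCollector struct {
	grouped  *prometheus.Desc // option A: one metric with a "type" label
	majorSep *prometheus.Desc // option B: a separate metric per fault type
	minorSep *prometheus.Desc
}

func newFaultsCollector() *faultsCollector {
	return &faultsCollector{
		grouped: prometheus.NewDesc("systemd_unit_memory_page_faults_total",
			"Page faults by type (grouped option).", []string{"name", "type"}, nil),
		majorSep: prometheus.NewDesc("systemd_unit_memory_major_page_faults_total",
			"Major page faults (separate-metric option).", []string{"name"}, nil),
		minorSep: prometheus.NewDesc("systemd_unit_memory_minor_page_faults_total",
			"Minor page faults (separate-metric option).", []string{"name"}, nil),
	}
}

func (c *faultsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.grouped
	ch <- c.majorSep
	ch <- c.minorSep
}

func (c *faultsCollector) Collect(ch chan<- prometheus.Metric) {
	// Dummy values standing in for pgmajfault / pgfault read from memory.stat.
	major, minor := 12.0, 34567.0
	ch <- prometheus.MustNewConstMetric(c.grouped, prometheus.CounterValue, major, "sshd.service", "major")
	ch <- prometheus.MustNewConstMetric(c.grouped, prometheus.CounterValue, minor, "sshd.service", "minor")
	ch <- prometheus.MustNewConstMetric(c.majorSep, prometheus.CounterValue, major, "sshd.service")
	ch <- prometheus.MustNewConstMetric(c.minorSep, prometheus.CounterValue, minor, "sshd.service")
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(newFaultsCollector())
	http.Handle("/metrics", promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	log.Fatal(http.ListenAndServe(":9559", nil)) // arbitrary port for this sketch
}
```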

baryluk avatar Feb 10 '20 19:02 baryluk

Thanks @baryluk

One thing to point out: if a metric has parts that can be summed and a "total" is also exposed as another part, Prometheus best practices say to avoid exposing the total. This allows cleaner use of sum(metric).
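
To illustrate with hypothetical names: if the memory.stat parts were label values of one metric and "total" were exposed as just another label value, a plain sum over the metric would double count, and every query would need an exclusion such as:

sum without (stat) (systemd_unit_memory_bytes{stat!="total"})

Dropping the total (or exposing it, if at all, as its own metric) keeps sum(systemd_unit_memory_bytes) meaningful on its own.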

SuperQ avatar Feb 10 '20 20:02 SuperQ

> Thanks @baryluk
>
> One thing to point out: if a metric has parts that can be summed and a "total" is also exposed as another part, Prometheus best practices say to avoid exposing the total. This allows cleaner use of sum(metric).

Sounds good!

Following the Prometheus best practices is a very good idea to reduce user confusion; the document is also logical, short, and well written, and helps avoid common mistakes with naming metrics and labels.

However, regarding your comment, I can't find a general guideline in the Prometheus best practices about not exporting a total metric when sum(other_metric) provides the same information (modulo minor atomicity differences, moving the aggregation cost from one process to another, and a bit of extra memory / storage).

I checked https://prometheus.io/docs/practices/naming/

baryluk avatar Feb 10 '20 20:02 baryluk

@baryluk It's under the labels section of "writing exporters".

https://prometheus.io/docs/instrumenting/writing_exporters/#labels

SuperQ avatar Feb 14 '20 14:02 SuperQ

Retitled as WIP while I work through the feedback

hamiltont avatar Feb 24 '20 18:02 hamiltont

@hamiltont thanks for all the amazing work you've done here. Do you have plans to resume work on this PR? Those cgroup rss metrics are very appealing :)

hectorhuertas avatar Apr 29 '20 15:04 hectorhuertas

Thanks! Yes, I actually have some nice commits to push; I've just been busy with my day job. I'll try to get an update out this week.

hamiltont avatar Apr 29 '20 16:04 hamiltont

What's the status here? I've been trying this out for a while and it really makes the metrics much more useful. I had to add a small patch to ignore services with the RemainAfterExit property (https://github.com/talyz/systemd_exporter/commit/c4f6128998ad2e36a980200b6f75e76b0e11bde5) in order for it to not spam my logs, but that's the only issue I've had with it so far.
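
For context, here is a rough sketch of the kind of guard such a patch adds. This is not the linked patch itself; the property names are real systemd unit properties, but the map-based data model and the reasoning that an exited RemainAfterExit unit has no live cgroup left to read are assumptions:

```go
package main

import "fmt"

// shouldSkipCgroupStats sketches the guard being described: skip cgroup reads for
// units flagged RemainAfterExit that have already exited, since there may be no
// live cgroup left to read. The property map is a stand-in for however the
// exporter actually fetches unit properties over D-Bus.
func shouldSkipCgroupStats(props map[string]interface{}) bool {
	remain, _ := props["RemainAfterExit"].(bool)
	subState, _ := props["SubState"].(string)
	return remain && subState == "exited"
}

func main() {
	oneshot := map[string]interface{}{"RemainAfterExit": true, "SubState": "exited"}
	fmt.Println(shouldSkipCgroupStats(oneshot)) // true: skip the memory.stat read
}
```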

talyz avatar Sep 13 '21 21:09 talyz

Well, unfortunately real life jumped in and took hold, but I am thrilled to hear that some folks have found it useful.

At this point, if anyone has time to wrap this work up (as I recall, that's mainly responding to the feedback that was given), the help would be super appreciated.

I'll see if I can push the other improvements I had to a distinct branch in case anyone is interested. As I recall they were definitely not ready (I was/am learning Go in the evenings, so while I can see the concepts needed to make the system better, it takes me quite a while to translate them into useful Go code).

hamiltont avatar Sep 13 '21 22:09 hamiltont

I see :) Yes, it's really useful - great work!

Unfortunately, I don't think I can be of much help - I only know golang at a very basic level. I hope someone can, though.

talyz avatar Sep 14 '21 15:09 talyz

I filed bug #46 about memory usage being reported incorrectly by this exporter. Would this PR fix that problem?

anarcat avatar May 10 '22 15:05 anarcat

After a very long time, we've moved this to prometheus-community. If you're still interested in working on this, please rebase this change against the latest commits.

SuperQ avatar Jul 19 '22 14:07 SuperQ

I tried to rebase this change against the latest commits, but there were a lot of merge conflicts. The result as it stands (squashed, because there were simply too many conflicts to resolve commit by commit) is in #66.

oseiberts11 avatar Nov 02 '22 17:11 oseiberts11