in_podman_metrics: Initial plugin implementation

Open pabloxxl opened this issue 3 years ago • 5 comments

This patch introduces a new plugin called in_podman_metrics. It reads the list of containers from podman's internal storage and, for each container, reads information from /sys and /proc (so no podman command is ever executed). It then exposes this data in cmetrics format, so exporters like prometheus_exporter can grab it.

This patch introduces an initial version of the in_podman_metrics plugin - for now it exposes 10 types of metrics: 4 related to container memory usage, 2 related to container cpu usage and 4 related to container network usage.

Signed-off-by: Paweł Cendrzak [email protected]

This PR adds an initial implementation of the in_podman_metrics input plugin. The plugin reads the podman container list from the podman configuration JSON file and, for each container, collects cpu, memory and network usage from the /sys and /proc filesystems. It supports both cgroups v1 and v2, and can detect containers in nested directories for both versions. The plugin does not use any podman commands or API calls.
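As a rough illustration of the cgroups v1/v2 handling and nested-directory lookup described above, here is a minimal Python sketch (the plugin itself is written in C, and this is not its code). The function names are hypothetical. It assumes the standard kernel layout: a v2 hierarchy has a cgroup.controllers file at its root and reports memory in memory.current, while v1 uses memory.usage_in_bytes under the memory controller. Matching directories by container id is also an assumption made for illustration.

```python
from pathlib import Path
from typing import List, Optional

# Kernel file holding current memory usage, keyed by cgroup version.
MEMORY_USAGE_FILE = {1: "memory.usage_in_bytes", 2: "memory.current"}


def cgroup_version(root: Path) -> int:
    """Return 2 if root looks like a unified (v2) hierarchy, else 1."""
    # A cgroup v2 mount exposes cgroup.controllers at its root.
    return 2 if (root / "cgroup.controllers").is_file() else 1


def find_cgroup_dirs(root: Path, container_id: str) -> List[Path]:
    """Find every directory, at any depth, whose name contains the id."""
    # Hypothetical matching rule: the id appears in the directory name.
    return sorted(p for p in root.rglob(f"*{container_id}*") if p.is_dir())


def read_memory_usage(root: Path, container_id: str) -> Optional[int]:
    """Read memory usage in bytes for a container, or None if not found."""
    usage_name = MEMORY_USAGE_FILE[cgroup_version(root)]
    for directory in find_cgroup_dirs(root, container_id):
        usage_file = directory / usage_name
        if usage_file.is_file():
            return int(usage_file.read_text().strip())
    return None
```

Taking the hierarchy root as a parameter, rather than hard-coding a path under /sys, lets the lookup be exercised against a temporary directory tree.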

Each scrape consists of four stages:

  1. Read the config JSON file (defaulting to /var/lib/containers/storage/overlay-containers/containers.json). Grab the name and id of each entry
  2. For each container, locate the /sys/fs/cgroups directory (or directories) that contains its data. Save memory and cpu information. Also record the first PID associated with it.
  3. For each container, get network usage from /proc//dev/net by using PID from step 2
  4. For each container, create a counter (or gauge) and set its value
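Stages 1 and 3 above can be sketched in Python as follows. This is an illustrative sketch, not the plugin's C implementation, and the function names are hypothetical. It assumes containers.json is a JSON array whose entries carry an "id" string and a "names" list, and that the network counters use the standard Linux per-process net/dev table layout (two header lines, then one line per interface with 8 receive and 8 transmit columns). Which four network values the plugin actually exports is also an assumption here.

```python
import json
from typing import Dict, List, Tuple


def parse_containers(json_text: str) -> List[Tuple[str, str]]:
    """Stage 1: return (id, name) for every entry in the container list."""
    result = []
    for entry in json.loads(json_text):
        # Assumed schema: "names" is a list; use the first name if present.
        names = entry.get("names") or [""]
        result.append((entry["id"], names[0]))
    return result


def parse_net_dev(text: str) -> Dict[str, Dict[str, int]]:
    """Stage 3: extract per-interface byte and error counters."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        iface, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # malformed line, ignore it
        stats[iface.strip()] = {
            "rx_bytes": int(fields[0]),    # receive: bytes
            "rx_errors": int(fields[2]),   # receive: errs
            "tx_bytes": int(fields[8]),    # transmit: bytes
            "tx_errors": int(fields[10]),  # transmit: errs
        }
    return stats
```

Both functions take text rather than file paths, so the caller decides which file to read (for example, the one selected by the PID from stage 2).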

The code included in this Pull Request was developed during implementation of the Motorola Solutions metrics feature. It mimics part of the functionality delivered by cadvisor (but for podman containers, which are not supported by it). Because of this, metric names follow the cadvisor naming convention.
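To make the naming convention concrete, the sketch below renders one gauge in the Prometheus text exposition format. It is a hypothetical Python helper, not part of the plugin, where cmetrics and the exporter do this work. container_memory_usage_bytes is an existing cadvisor metric name; whether the plugin exports exactly this name, and the id and name labels used in the usage note, are assumptions.

```python
from typing import Dict, List, Tuple, Union

Number = Union[int, float]


def render_metric(
    name: str,
    kind: str,
    help_text: str,
    samples: List[Tuple[Dict[str, str], Number]],
) -> str:
    """Render one metric family as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {kind}"]
    for labels, value in samples:
        # Labels are sorted for stable output; label values are not
        # escaped here, which a real exporter would need to do.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"
```

For example, calling render_metric("container_memory_usage_bytes", "gauge", "Container memory usage in bytes", [({"id": "abc123", "name": "web"}, 4096)]) yields a HELP line, a TYPE line and one sample line carrying both labels.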

  • Valgrind execution log: valgrind.txt
  • Configuration file: conf.txt
  • Log with debug log level: log.txt
  • Documentation PR


Enter [N/A] in the box, if an item is not applicable to your change.

Testing

Before we can approve your change, please submit the following in a comment:

  • [x] Example configuration file for the change
  • [x] Debug log output from testing the change
  • [x] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Attached local packaging test output showing all targets (including any new ones) build.

Documentation

  • [x] Documentation required for this feature

Documentation PR

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

pabloxxl avatar Jul 16 '22 15:07 pabloxxl

@pabloxxl can you rebase, we were having issues with the MacOS unit tests that are hopefully resolved now? Yours timeout here.

patrick-stephens avatar Aug 17 '22 10:08 patrick-stephens

rebased

pabloxxl avatar Aug 22 '22 09:08 pabloxxl

Whoops, sorry, I thought that "comment and close" meant closing a particular comment, not the entire pull request :(

pabloxxl avatar Aug 22 '22 09:08 pabloxxl

I've noticed that all builds in workflows failed because cmt_time_now() is now cfl_time_now(). I hope it will work now.

pabloxxl avatar Sep 15 '22 12:09 pabloxxl

Also, in one of the TCs on an Ubuntu machine I've noticed a segmentation fault while assigning the number of read bytes... It seems that this variable should be size_t (d'oh) and not int - so I've changed it.

pabloxxl avatar Sep 15 '22 14:09 pabloxxl

@edsiper is anything more required for this pull request? It has been hanging for some time now.

pabloxxl avatar Nov 07 '22 10:11 pabloxxl

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions[bot] avatar Feb 06 '23 02:02 github-actions[bot]

@edsiper Hey again, just reminding about this PR

pabloxxl avatar Feb 06 '23 10:02 pabloxxl

@pabloxxl I've triggered the packaging tests just to confirm it builds for all targets.

patrick-stephens avatar Feb 06 '23 11:02 patrick-stephens

It seems that a lot of build targets failed :( I have backmerged to master and removed .event_type (since it looks like it is no longer used for input plugins)

pabloxxl avatar Feb 07 '23 09:02 pabloxxl

@pabloxxl could you update your commits to be signed off and use the in_podman_metrics: prefix? I can approve the CI runs but it'll just fail because of that so be good to get it in first.

patrick-stephens avatar Feb 21 '23 09:02 patrick-stephens

I've changed the top commit to use the prefix and be signed off - or should I change every one of my commits? I can also squash them I guess.

pabloxxl avatar Feb 21 '23 09:02 pabloxxl

I've squashed all commits into the original one

pabloxxl avatar Mar 03 '23 10:03 pabloxxl

@edsiper @patrick-stephens it seems that all checks have passed. Is there anything more to be done here?

pabloxxl avatar Mar 23 '23 10:03 pabloxxl

Just needs @edsiper to confirm he is happy with the changes since review I think.

patrick-stephens avatar Mar 23 '23 11:03 patrick-stephens

@pabloxxl are you willing to take ownership of maintaining this? We need to specify codeowners as well.

patrick-stephens avatar Mar 23 '23 11:03 patrick-stephens

Sure! I will be glad if I could be more involved.

pabloxxl avatar Mar 23 '23 11:03 pabloxxl

Hey @edsiper, can you take a look?

pabloxxl avatar Apr 04 '23 08:04 pabloxxl