fix(libsinsp): enable metrics collector on all platforms
What type of PR is this?
Uncomment one (or more) /kind <> lines:
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
Any specific area of the project related to this PR?
Uncomment one (or more) /area <> lines:
/area API-version
/area build
/area CI
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap-engine-bpf
/area libscap-engine-gvisor
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap-engine-nodriver
/area libscap-engine-noop
/area libscap-engine-source-plugin
/area libscap-engine-savefile
/area libscap
/area libpman
/area libsinsp
/area tests
/area proposals
Does this PR require a change in the driver versions?
/version driver-API-version-major
/version driver-API-version-minor
/version driver-API-version-patch
/version driver-SCHEMA-version-major
/version driver-SCHEMA-version-minor
/version driver-SCHEMA-version-patch
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
fix(libsinsp): enable metrics collector on all platforms
Since we don't need this for the next release, I'd put this in the 0.18.0 milestone.
/milestone 0.18.0
I think we might want to move these in the scap_platform vtable, likely as a struct scap_metrics_vtable (embedded in each scap_foo_platform), so that we could get platform-dependent metrics from the scap handle. Again, this might be an idea for a future refactor.
Hey @FedeDP, makes sense! Since you moved this to the next milestone and we are not in a hurry, I can take care of this :)
Added this as an item to https://github.com/falcosecurity/falco/issues/3194#issuecomment-2111009270. Just to reiterate: If we could fix the agent info initialization for Linux for the plugin platform (see https://github.com/falcosecurity/falco/issues/2821) -- it would be fantastic. For macOS and Windows CPU utilization and memory usage calculation would need to be new code, not sure if truly needed, WDYT?
If we could fix the agent info initialization for Linux for the plugin platform (see https://github.com/falcosecurity/falco/issues/2821) -- it would be fantastic.
Agree!
For macOS and Windows CPU utilization and memory usage calculation would need to be new code, not sure if truly needed, WDYT?
I think it would be interesting to expose those metrics for macOS and Windows too, but yes, it's not high priority.
@mrgian hope all is well, just wanted to kindly check in and ask what our current plan is to get out of the regression in our scap platforms approach? (https://github.com/falcosecurity/falco/issues/2821) If we can have a proper refactor -- amazing. Else I would also support something more intermediate to ensure the next Falco release does not have this regression anymore. CC @FedeDP @leogr
Thanks in advance!
Hey @incertum
For now I'm just moving the Linux-specific metrics collection logic to the scap_platform vtable, so that we can use the scap handle to gather platform-dependent metrics. This will make libs_metrics_collector platform-agnostic.
I'm not working on a proper refactor that will solve the regression, but if you have any idea for that please let me know!
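To make the vtable idea above concrete, here is a minimal sketch of a metrics vtable embedded in a platform struct. It is not code from this PR or from libscap: the fields of scap_metrics_vtable, the scap_foo_platform layout, and the helper read_memory_usage_kb are all assumptions made up for illustration, and the memory value is a stand-in rather than a real reading.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: only the names scap_metrics_vtable and
// scap_foo_platform come from the discussion; every field is an assumption.
struct scap_metrics_vtable {
    // Writes the agent's resident memory to *rss_kb; returns 0 on success.
    int32_t (*get_memory_usage_kb)(void* platform, uint64_t* rss_kb);
};

struct scap_foo_platform {
    const scap_metrics_vtable* metrics;  // null when the platform has no metrics
    uint64_t fake_rss_kb;                // stand-in for real platform state
};

static int32_t foo_get_memory_usage_kb(void* platform, uint64_t* rss_kb) {
    auto* p = static_cast<scap_foo_platform*>(platform);
    *rss_kb = p->fake_rss_kb;
    return 0;
}

static const scap_metrics_vtable foo_metrics_vtable = {foo_get_memory_usage_kb};

// Platform-agnostic caller: dispatches through the vtable when one is present.
inline bool read_memory_usage_kb(scap_foo_platform* p, uint64_t* out) {
    if(p->metrics == nullptr || p->metrics->get_memory_usage_kb == nullptr) {
        return false;
    }
    return p->metrics->get_memory_usage_kb(p, out) == 0;
}

// Small self-check covering both the supported and the unsupported path.
inline void vtable_demo() {
    scap_foo_platform with_metrics{&foo_metrics_vtable, 2048};
    scap_foo_platform without_metrics{nullptr, 0};
    uint64_t kb = 0;
    assert(read_memory_usage_kb(&with_metrics, &kb) && kb == 2048);
    assert(!read_memory_usage_kb(&without_metrics, &kb));
}
```

The caller never checks which platform it runs on; it only checks whether the platform filled in the vtable. The trade-off raised later in the thread still applies: this keeps the metrics plumbing in plain C inside libscap.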
Posted here https://github.com/falcosecurity/falco/issues/2821#issuecomment-2238356760
Seems like this one and #2821 are intertwined :)
@mrgian, please take a look at my comment https://github.com/falcosecurity/libs/pull/1969#discussion_r1687589774 for some ideas about the future direction of libscap/libsinsp and scap_platform. IMO, let's move stuff out of libscap, not into (especially here: libscap doesn't care one bit about these metrics, they're purely for libsinsp use).
If you agree with that, then I think it's not a good idea to add more stuff to scap_platform. Instead, we can make metrics_collector a virtual base class (this is effectively what a scap_platform is) and move the concrete implementation to e.g. userspace/libsinsp/linux/metrics_collector.cpp.
Then, we have two options for the consumers of the metrics:
1. provide a no-op userspace/libsinsp/generic/metrics_collector.cpp for other platforms so that we always have some metrics collector, or
2. add a #define (presumably via cmake) that says we do have a metrics_collector for this platform and #ifdef on that, rather than __linux__
(I'd rather go for 1, personally)
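A minimal sketch of the first option, assuming a much simpler interface than the real one. The metric struct, the snapshot() method, the class names, and make_metrics_collector() are hypothetical; only the two file paths in the comments come from the proposal above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical metric record; the real libs metric type carries more fields.
struct metric {
    std::string name;
    uint64_t value;
};

// Platform-agnostic interface (would live in a shared header).
class metrics_collector {
public:
    virtual ~metrics_collector() = default;
    virtual std::vector<metric> snapshot() = 0;
};

// Would live in e.g. userspace/libsinsp/linux/metrics_collector.cpp.
class linux_metrics_collector : public metrics_collector {
public:
    std::vector<metric> snapshot() override {
        // Placeholder value; a real implementation would read process stats.
        return {metric{"memory_rss_kb", 0}};
    }
};

// Would live in e.g. userspace/libsinsp/generic/metrics_collector.cpp.
class generic_metrics_collector : public metrics_collector {
public:
    std::vector<metric> snapshot() override { return {}; }
};

// In a real build each platform directory would supply its own definition of
// this factory; the #ifdef only stands in for that build-system choice here.
inline std::unique_ptr<metrics_collector> make_metrics_collector() {
#ifdef __linux__
    return std::make_unique<linux_metrics_collector>();
#else
    return std::make_unique<generic_metrics_collector>();
#endif
}

// Consumer code needs no platform check: a no-op collector just yields nothing.
inline std::size_t count_metrics() {
    auto collector = make_metrics_collector();
    assert(collector != nullptr);
    return collector->snapshot().size();
}
```

With this shape the platform decision is made once, in the factory, and every consumer can call the collector unconditionally. Option 2 would instead push a feature macro check into each consumer.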
One thing to bikeshed would be the directory structure (it's trivial here, but it will set precedent for future per-platform components). I see two approaches:
libsinsp/linux/metrics_collector.cpp:
- (good) we can build e.g. sinsp_linux.a from the whole libsinsp/linux directory, simplifying the build system a little
- (bad) the API header would have to live directly in libsinsp/
libsinsp/metrics_collector/linux_metrics_collector.cpp:
- (good) provides a nice place for a platform-agnostic header with the base class definition
- (bad) platform-specific code is spread across directories, making it a bit less convenient to create common per-platform helpers (would have to live in something like libsinsp/linux/common.h)
I don't have a strong opinion on this either way tbh.
Hey @gnosek, I see now. I agree on keeping the metrics collection logic out of scap_platform. Also, scap_platform is plain C, which can make collecting other kinds of metrics harder.
Instead, we can make metrics_collector a virtual base class
If I'm not wrong, currently libs_resource_utilization (https://github.com/falcosecurity/libs/blob/master/userspace/libsinsp/metrics_collector.h#L271-L300) is the only class with linux-only code.
A similar solution would be making libs_resource_utilization a virtual class (with platform-specific implementations).
As you said, taking a decision on the directory naming will influence future components development, so I'll wait to know what the maintainers think.
Perf diff from master - unit tests
10.18% -0.88% [.] sinsp::next
6.71% +0.52% [.] sinsp_evt::get_type
2.80% -0.43% [.] is_conversion_needed
3.35% -0.40% [.] sinsp_thread_manager::get_thread_ref
9.67% -0.38% [.] sinsp_parser::reset
1.11% +0.38% [.] sinsp_evt::get_ts
5.83% -0.32% [.] next_event_from_file
1.16% +0.31% [.] sinsp_parser::event_cleanup
3.42% -0.29% [.] sinsp_thread_manager::find_thread
0.74% +0.27% [.] libsinsp::events::is_unknown_event
Heap diff from master - unit tests
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
Heap diff from master - scap file
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
Benchmarks diff from master
Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
| Benchmark | Time | CPU | Time Old | Time New | CPU Old | CPU New |
|---|---|---|---|---|---|---|
| BM_sinsp_split_mean | -0.0534 | -0.0535 | 151 | 143 | 151 | 143 |
| BM_sinsp_split_median | -0.0558 | -0.0559 | 150 | 142 | 150 | 142 |
| BM_sinsp_split_stddev | +0.3083 | +0.3113 | 2 | 2 | 2 | 2 |
| BM_sinsp_split_cv | +0.3821 | +0.3854 | 0 | 0 | 0 | 0 |
| BM_sinsp_concatenate_paths_relative_path_mean | -0.0599 | -0.0600 | 61 | 57 | 61 | 57 |
| BM_sinsp_concatenate_paths_relative_path_median | -0.0597 | -0.0598 | 61 | 57 | 61 | 57 |
| BM_sinsp_concatenate_paths_relative_path_stddev | -0.1440 | -0.1455 | 0 | 0 | 0 | 0 |
| BM_sinsp_concatenate_paths_relative_path_cv | -0.0894 | -0.0910 | 0 | 0 | 0 | 0 |
| BM_sinsp_concatenate_paths_empty_path_mean | +0.0481 | +0.0480 | 24 | 25 | 24 | 25 |
| BM_sinsp_concatenate_paths_empty_path_median | +0.0476 | +0.0475 | 24 | 25 | 24 | 25 |
| BM_sinsp_concatenate_paths_empty_path_stddev | +0.4570 | +0.4547 | 0 | 0 | 0 | 0 |
| BM_sinsp_concatenate_paths_empty_path_cv | +0.3902 | +0.3881 | 0 | 0 | 0 | 0 |
| BM_sinsp_concatenate_paths_absolute_path_mean | -0.1516 | -0.1516 | 67 | 56 | 67 | 56 |
| BM_sinsp_concatenate_paths_absolute_path_median | -0.1654 | -0.1654 | 67 | 56 | 67 | 56 |
| BM_sinsp_concatenate_paths_absolute_path_stddev | -0.0706 | -0.0704 | 1 | 1 | 1 | 1 |
| BM_sinsp_concatenate_paths_absolute_path_cv | +0.0954 | +0.0957 | 0 | 0 | 0 | 0 |
| BM_sinsp_split_container_image_mean | +0.0178 | +0.0177 | 385 | 392 | 385 | 392 |
| BM_sinsp_split_container_image_median | +0.0178 | +0.0177 | 385 | 392 | 385 | 392 |
| BM_sinsp_split_container_image_stddev | -0.3534 | -0.3539 | 2 | 2 | 2 | 2 |
| BM_sinsp_split_container_image_cv | -0.3647 | -0.3651 | 0 | 0 | 0 | 0 |
Codecov Report
Attention: Patch coverage is 93.75000% with 7 lines in your changes missing coverage. Please review.
Project coverage is 75.19%. Comparing base (230ddfb) to head (ea6ddee). Report is 5 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| userspace/libsinsp/linux/resource_utilization.cpp | 93.33% | 7 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #1870 +/- ##
==========================================
- Coverage 75.19% 75.19% -0.01%
==========================================
Files 259 261 +2
Lines 33875 33875
Branches 5800 5801 +1
==========================================
- Hits 25473 25472 -1
- Misses 8402 8403 +1
| Flag | Coverage Δ | |
|---|---|---|
| libsinsp | 75.19% <93.75%> (-0.01%) | :arrow_down: |
If I'm not wrong, currently libs_resource_utilization (https://github.com/falcosecurity/libs/blob/master/userspace/libsinsp/metrics_collector.h#L271-L300) is the only class with linux-only code.
Confirmed.
As you said, taking a decision on the directory naming will influence future components development, so I'll wait to know what the maintainers think.
Also don't have any preference. Maybe go with what @gnosek deems slightly better, because Grzeg has been around the block for some time and I get all the callouts. The ifdefs were a good solution to get these metrics going. Now we can finally get it right. By now 4+ folks already refactored the libs metrics collector, so there is hope that we will stabilize that code at some point 🙃 .
Any news on this @mrgian ?
Hey @FedeDP, not yet! We decided to refactor this again :( and currently I'm busy with other tasks, so I don't think this will make it into the next release, but I will start working on this as soon as I can.
Ok! Moving to next milestone then :) /milestone 0.19.0
Ehi @incertum @gnosek
I moved all the linux-specific code into linux/resource_utilization.cpp.
If compiled on a non-linux platform, instead of using linux_resource_utilization we use a generic libs_metrics which returns an empty metrics vector on to_metrics().
This allows us to use the metrics collector on all platforms.
WDYT?
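A rough sketch of how that fallback can behave from the collector's side, under simplifying assumptions. The class names libs_metrics and linux_resource_utilization and the method to_metrics() come from the comment above; the simple_metric struct, the metric names, the placeholder values, and the collect() helper are hypothetical and are not the code merged in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified metric record (not the real libs struct).
struct simple_metric {
    std::string name;
    uint64_t value;
};

// Generic source: by default it contributes no metrics at all.
class libs_metrics {
public:
    virtual ~libs_metrics() = default;
    virtual std::vector<simple_metric> to_metrics() { return {}; }
};

// Linux-only source; the values are placeholders, not real readings.
class linux_resource_utilization : public libs_metrics {
public:
    std::vector<simple_metric> to_metrics() override {
        return {simple_metric{"cpu_usage_perc", 0},
                simple_metric{"memory_rss_kb", 0}};
    }
};

// The collector concatenates whatever each source returns, so an empty
// vector from the generic source needs no special handling.
inline std::vector<simple_metric> collect(
        const std::vector<std::unique_ptr<libs_metrics>>& sources) {
    std::vector<simple_metric> out;
    for(const auto& source : sources) {
        assert(source != nullptr);
        auto metrics = source->to_metrics();
        out.insert(out.end(), metrics.begin(), metrics.end());
    }
    return out;
}
```

Because the generic source returns an empty vector rather than failing, the collection loop is identical on every platform.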
/cc @gnosek
/milestone 0.20.0
LGTM label has been added.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: FedeDP, mrgian
- ~~OWNERS~~ [FedeDP]