libs
libs copied to clipboard
[WIP] cleanup(driver,userspace/libscap,userspace/libsinsp): dropped simpleconsumer mode and simple driver concepts
What type of PR is this?
/kind cleanup
Any specific area of the project related to this PR?
/area driver-kmod
/area driver-bpf
/area libscap-engine-bpf
/area libscap-engine-kmod
Does this PR require a change in the driver versions?
/version driver-API-version-major
/version driver-API-version-minor
/version driver-API-version-patch
/version driver-SCHEMA-version-major
/version driver-SCHEMA-version-minor
/version driver-SCHEMA-version-patch
We'd need to bump the API maj version, but it was already bumped during this release cycle.
What this PR does / why we need it:
Dropped simpleconsumer mode and simple driver concepts.
Instead, clients should always push interesting syscalls set using new sinsp APIs:
void set_syscalls_of_interest(std::unordered_set<uint32_t> &syscalls_of_interest);
void mark_syscall_of_interest(uint32_t ppm_sc, bool enabled = true);
Moreover, moved bpf interesting syscall check as soon as possible. A new map was created to hold interesting syscalls for bpf.
Dropped EF_DROP_SIMPLE_CONS and UF_SIMPLEDRIVER_KEEP flags too.
Moreover, added a new interesting_tp_set
field to scap_open_args, to allow scap clients to specify a set of tracepoints to be attached.
When NULL, all tracepoints will be attached (default value). Updated interesting_ppm_sc_codeset to match same behavior when NULL. Libbscap scap-open example now has 2 more options:
--tpand
--ppm_sc`, that allow to:
- mark a single (or multiple, if option is passed multiple times) tracepoints as interesting
- mark a single (or multiple, if option is passed multiple times) syscalls as interesting
Note however that the new interesting_tp_set
is actually unused by sinsp, ie: it is only available to scap clients, not sinsp ones.
IMHO this is a somewhat debug feature that should not be exposed to libs clients.
Example:
sudo ./libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o --tp sys_exit --ppm_sc 27 --ppm_sc 28
---------------------- SCAP SOURCE ----------------------
* BPF probe: 'driver/bpf/probe.o'
-----------------------------------------------------------
---------------------- CONFIGURATIONS ----------------------
* Print single event type: -1 (`-1` means no event to print).
* Run until '18446744073709551615' events are catched.
--------------------------------------------------------------
* OK! BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture
^C
---------------------- STATS -----------------------
events captured: 2
seen by driver: 2
Where ppm_sc 27 and 28 are mkdir
and rmdir
, and i've got another terminal that runs mkdir x; rmdir x
.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
NONE
/cc @gnosek
❤️
Could we include a patch to scap-open
?
Add summary of syscall events (or any event) in form of a set() showing distinct syscall ids seen (sanity check for perf studies). Also transform to human readable list of syscall names if possible.
Edited: Perhaps also add start and end timestamp, duration of run and average event rate per second to scap-open
summary stats? @FedeDP
@Andreagit97 i refactored all the kmod main to always pass through the force_tp_set
function when registering/unregistering tracepoints. The code is much more compact and clean now :)
We still miss:
- [ ] dropping UF_ALWAYS_DROP logic -> TODO: potentially leave it to a subsequent PR; see https://github.com/falcosecurity/libs/pull/521#issuecomment-1205032197
- [x] ~~possibly dropping
PPM_IOCTL_ENABLE/DISABLE_SIGNAL_DELIVER
, and possibly evenPPM_IOCTL_ENABLE_PAGE_FAULTS
@gnosek (our historical memory :D ) perhaps you even know why we do not automatically attach page faults tracepoints inppm_open
, and instead always wait for the proper ioctl~~ :/ This was achieved in 5th commit - [x] understand what to do with old
simple_consumer_consider_evt()
functions: unfortunately,EF_DROP_SIMPLE_CONS
flags were also used to allow userspace filtering of non-syscall driven events (or, well, events coming from other tracepoints, likePPME_SIGNALDELIVER_E
). A possible solution is to actually allow clients to disable tracepoints (exposing the proper api from sinsp), and then allow eg: Falco to disable syscalls it he not interested in, + disabling tracepoints he is not interesting into. -> this was achieved in 5th commit (see https://github.com/falcosecurity/libs/pull/521#issuecomment-1204992916) This would also give a nice performance boost and lower drops while it runs in old simple consumer mode, because we would completely avoid attaching some high volumes tracepoints. @Andreagit97
@incertum this PR gives scap-open the ability to entirely disable tracepoints and mark syscalls as uninteresting while starting, like doing:
sudo ./libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o --tp sys_exit --ppm_sc 27 --ppm_sc 28
would mean: "only attach sys_exit
and only monitor mkdir
and rmdir
".
Latest commit exposes new scap_open_args.tp_of_interest
in sinsp; we will now expect consumers of sinsp to properly set desired syscalls_of_itnerest and tp_of_interest values.
Note that Falco did never attach PAGE_FAULT tracepoints, therefore it cannot leverage default "NULL" scap_open_args.tp_of_interest
value that will instead enable all supported ones, following same logic as syscalls_of_interest
.
Moreover, IOCTLs to runtime enable signal deliver and page faults tracepoints were dropped because given the new tp_of_interest
feature, they were useless.
It would also be great to delete the secondary drop logic in drivers, ie: the one that gets enabled in ex simple consumer mode, when driver's dropping_mode
is enabled, and we drop all UF_ALWAYS_DROP
events (NOTE: they do not perfectly overlap with old EF_DROP_SIMPLE_CONS events :/ ) and events without UF_NEVER_DROP flag when we exceed the setting's sampling ratio, like for example in our probe:
if (drop_flags & UF_NEVER_DROP)
return false;
if (drop_flags & UF_ALWAYS_DROP)
return true;
if (state->tail_ctx.ts % 1000000000 >= 1000000000 /
settings->sampling_ratio) {
if (!settings->is_dropping) {
settings->is_dropping = true;
state->tail_ctx.evt_type = PPME_DROP_E;
return false;
}
return true;
}
We also send a PPME_DROP_E/X
event when we are entering/leaving dropping mode because of sampling ration.
I think that:
- we could just send the
PPME_DROP_E
event when driver is under pressure, and let userspace update syscalls_of_interest (that we still support updating at runtime) to eventually remove some more syscalls_of_interest. - As soon as
PPME_DROP_X
is received, userspace can then restore initial syscalls_of_interest.
This behavior is much more simple and allows for fine-grained control by userspace, without drivers internal weird inscrutable logic that userspace knows nothing about and can't do anything about.
We could also entirely drop PPME_DROP_E/X
events, and instead rely on consumers calling scap_get_stats()
(like Falco does) and managing "driver under pressure" situation themselves with a custom logic, eg: using sinsp::set_eventmask()
.
@incertum
Probably missing a lot of context and discussions. What is the desired end user interface to be able to control tp_of_interest and ppm_sc_of_interest? Would be very interested in having easy to manage overwrite control in order to optimize the tool for various environments and use cases as I need to be very careful about performance overhead.
You've got 4 APIs sinsp side:
void set_syscalls_of_interest(std::unordered_set<uint32_t> &syscalls_of_interest);
void set_tracepoints_of_interest(std::unordered_set<std::string> &tp_of_interest);
void mark_syscall_of_interest(uint32_t ppm_sc, bool enabled = true);
void mark_tracepoint_of_interest(string &tp, bool enabled = true);
that must be called before sinsp::open()
is called.
We are still discussing about the runtime change API (we already have sinsp::set_eventmask
and sinsp::unset_eventmask
, but perhaps we can do a little better there).
EDIT: commit https://github.com/falcosecurity/libs/pull/521/commits/cc23e7dab7d374dc008cefb349d64b2c9981e056 makes some fixes and allows the aforementioned sinsp methods to be called at runtime too, ie when scap is already opened, and will automatically handle runtime syscalls of interest. Note that runtime tracepoints of interest are still to be implemented (perhaps for Falco 0.34).
Rebased on top of master.
@leogr @gnosek this PR is now ready to be reviewed :)
@leogr @gnosek this PR is now ready to be reviewed :)
Hey @FedeDP
This PR is awesome :star_struck: After a quick look, it SGTM, a significant improvement!
If I understood correctly, as discussed privately, to preserve the old sinsp behavior, one should do something like:
m_inspector->set_syscalls_of_interest(enforce_sinsp_syscalls_of_interest(falco_optimized_syscall_set));
Do you confirm? It would be helpful to advertise that to libs consumers, I guess.
Exactly that @leogr ! I tried my best with the new sinsp api doc :)
Rebased on top of current master after the #540 merge. Thanks to @Andreagit97 for holding my hand during the rebase :sob:
@FedeDP Awesome job! Looking forward to seeing this PR merged!
LGTM label has been added.
@FedeDP @Andreagit97 I tested a build of the SysFlow collector with this PR, and it seems like it introduced a regression issue when using the BPF driver. Essentially, what I observed is that the BPF driver loads successfully but we don't see any events coming from sinsp. When using the kmod driver, things work normally. I wonder if the event table is being correctly populated and passed to the driver in the BPF case?
@araujof
sudo ./libscap/examples/01-open/scap-open --bpf driver/bpf/probe.o --tp 0 --ppm_sc 27 --ppm_sc 28
[sudo] password di federico:
---------------------- SCAP SOURCE ----------------------
* BPF probe: 'driver/bpf/probe.o'
-----------------------------------------------------------
--------------------- CONFIGURATIONS ----------------------
* Print single event type: -1 (`-1` means no event to print).
* Run until '18446744073709551615' events are catched.
-----------------------------------------------------------
---------------------- INTERESTING SYSCALLS ----------------------
* Syscalls enabled:
- mkdir
- rmdir
------------------------------------------------------------------
---------------------- ENABLED TRACEPOINTS ----------------------
* Tracepoints enabled:
- sys_enter
-----------------------------------------------------------------
* OK! BPF probe correctly loaded: NO VERIFIER ISSUES :)
* Live capture in progress...
* Press CTRL+C to stop the capture
^C
---------------------- STATS -----------------------
Events captured: 2
Seen by driver: 2
Time elapsed: 5 s
Working here: i opened the bpf probe with just mkdir and rmdir as interesting syscalls, and sys_exit tracepoint; in another terminal i ran mkdir p && rmdir p -> only 2 events.
/cc @leogr @Molter73 @gnosek
I tested a build of the SysFlow collector with this PR, and it seems like it introduced a regression issue when using the BPF driver. Essentially, what I observed is that the BPF driver loads successfully but we don't see any events coming from sinsp. When using the kmod driver, things work normally. I wonder if the event table is being correctly populated and passed to the driver in the BPF case?
If i understood correctly, @Molter73 experienced something like this even on master, right? I think it does not depend upon this PR but we will certainly try to understand if we got a bug and eventually fix it!
I tested a build of the SysFlow collector with this PR, and it seems like it introduced a regression issue when using the BPF driver. Essentially, what I observed is that the BPF driver loads successfully but we don't see any events coming from sinsp. When using the kmod driver, things work normally. I wonder if the event table is being correctly populated and passed to the driver in the BPF case?
If i understood correctly, @Molter73 experienced something like this even on master, right? I think it does not depend upon this PR but we will certainly try to understand if we got a bug and eventually fix it!
I did experience something similar yesterday on various commits from the master branch, in my case in particular I was not getting any execve
events with the eBPF probe, but was getting some other syscalls. We looked into it with @Andreagit97 and seems to be some bug that happens sporadically, which makes it harder to debug.
@Molter73 thanks to @Andreagit97 too! :) I will commit multiple suggestions by you then squash them into a single commit!
@Molter73 i applied all the useful suggestions! Thank you very much!
It's always a pleasure to work with you @FedeDP :rocket:
@Molter73 we don't want to break again your tests with other changes in the sinsp-example
:( Are these changes ok for you or do we need to change something? or maybe we can leave the example as it is and modify it in another PR, let us know how we can help you :)
Yep, we also have some conflicts with master. Tomorrow i will rebase!
@Andreagit97, I can fix the tests as many times as needed, don't worry about it. I don't think it's worth it to slow down development for a set of tests that are not yet official for the repo.
Rebased on top of master.
LGTM label has been added.
I cannot approve it, because i worked on it, but it seems ready IMHO
Pulled latest changes and simulated https://github.com/falcosecurity/libs/pull/521#issuecomment-1227099508 in the sinsp-example
and everything seems to work as expected.
Simulation was using the new APIs to enforce syscalls of interest, but adding fchmodat
which can be important to subscribe to through Falco rules.
std::unordered_set<uint32_t> ppm_sc = inspector.enforce_sinsp_state_ppm_sc();
open_engine(inspector);
inspector.mark_ppm_sc_of_interest(207); // fchmodat