Set the sampling ratio for dropping each syscall individually
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap-engine-bpf
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap
/area libpman
Does this PR require a change in the driver versions?
/version driver-API-version-major
/version driver-API-version-minor
What this PR does / why we need it: Set the sampling ratio for dropping each syscall individually, so that users can set different discard rates for syscalls of different importance.
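Roughly, the proposed knob amounts to something like the sketch below. This is a minimal illustration with made-up names, a made-up table size, and x86_64 syscall numbers; it is not the actual libscap API, which plumbs the per-syscall ratios through scap and the drivers.

```c
/* Minimal sketch of a per-syscall sampling-ratio table; all names,
 * sizes, and syscall numbers here are illustrative. */
#include <stdint.h>
#include <stdio.h>

#define SYSCALL_TABLE_SIZE 512 /* hypothetical table size */

static uint32_t sampling_ratio[SYSCALL_TABLE_SIZE]; /* one ratio per syscall */

/* Hypothetical setter: keep 1 out of every `ratio` events for `syscall_id`. */
static int set_syscall_sampling_ratio(uint32_t syscall_id, uint32_t ratio)
{
	if (syscall_id >= SYSCALL_TABLE_SIZE || ratio == 0)
		return -1;
	sampling_ratio[syscall_id] = ratio;
	return 0;
}

int main(void)
{
	set_syscall_sampling_ratio(43, 128); /* accept: keep 1 event in 128 */
	set_syscall_sampling_ratio(44, 1);   /* sendto: keep everything */
	printf("accept ratio: %u\n", sampling_ratio[43]);
	return 0;
}
```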
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected. Please follow our release note process to remove it.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Welcome @wangyongfeng5! It looks like this is your first PR to falcosecurity/libs 🎉
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: wangyongfeng5. Once this PR has been reviewed and has the lgtm label, please assign molter73 for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Hi! Thanks for your contribution! This is a rather interesting idea; I think you need to bump the driver/API_VERSION major (and SCAP_MINIMUM_API_VERSION). Also, it would be great to have tests around the new feature; we already cover the sampling ratio feature in https://github.com/falcosecurity/libs/tree/master/test/drivers/test_suites/actions_suite — can you expand/fix them up?
Seems like a nice improvement; tagging our sampling ratio expert @gnosek! Moreover, we are in the process of releasing a new libs version; not sure we can do this for the next tag, but we will do our best!
Sorry to say, I'm not too excited about this PR as it currently stands. The issues I have with it are:
- we really need DROP_[EX] events to contain the sampling ratio (getting rid of the single global sampling ratio obviously complicates this :))
- replacing "the" sampling ratio with an array feels wrong. With your patch, the sampling ratio becomes an attribute of the individual syscalls, so maybe it should become an extension of the ppm_sc_of_interest concept (the per-syscall choice goes from accept/drop to accept, accept-1/nth-of-the-time, or drop) rather than an extension of the global sampling ratio concept?
My suggestion would be to leave the global sampling ratio alone and add a separate per-syscall sampling step afterwards, so that the higher sampling ratio (global or per-syscall) wins, although you'd realistically only use one or the other.
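Concretely, that suggestion might look like the sketch below, assuming "ratio" means "keep 1 event out of every N"; the names are illustrative, not the driver's actual drop path.

```c
/* Sketch of "the higher sampling ratio (global or per-syscall) wins";
 * a simplification, not the real driver logic. */
#include <stdbool.h>
#include <stdint.h>

static inline bool should_drop(uint64_t event_count,
                               uint32_t global_ratio,
                               uint32_t per_syscall_ratio)
{
	/* The higher ratio, i.e. the more aggressive dropping, wins. */
	uint32_t ratio = global_ratio > per_syscall_ratio ? global_ratio
	                                                  : per_syscall_ratio;
	if (ratio <= 1)
		return false;                  /* sampling disabled: keep all */
	return (event_count % ratio) != 0; /* keep every ratio-th event */
}
```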
Also, what's the end use case for this? What would you use the extra flexibility for?
My end use case: the user process makes some unreasonable system calls, such as calling accept on a non-blocking socket, which generates a large number of useless events, so we have to enable sampling. At the same time, we don't want to lose other useful system calls, such as sendto, as they are important to the upper-level rules. Here are two ways:
- Provide a method to protect certain system calls from being discarded during sampling
- Provide a method to set an individual sampling rate per syscall, setting the protected system calls to 100% while providing more options other than 100%.
This PR uses the second method. In fact, I have tried both methods.
So, I would like to know: 1. Do you think the above usage scenario needs to be supported? 2. If yes, which of the two methods do you think is more suitable?
Uhm the scenario is clear now, thank you!
Let's say, if you want to enable only some syscalls and disable others that are noisy, we already have a solution in place: the "interesting syscalls" logic. As you may imagine, you pass the drivers a simple table saying which syscalls you want.
If I understood well, here you are searching for something slightly different: you don't want to disable syscalls, you just want to sample them, because maybe some accept events (as in your example) could be useful, so you don't want to drop them all. If this is the case, maybe we could follow approach 1, i.e. "allow the driver users to specify which syscalls they don't want to be sampled". Probably approach 1 is more convenient because we already have almost everything in place (see the UF_NEVER_DROP and UF_ALWAYS_DROP flags); we just need to add custom logic to allow users to put these flags on the various syscalls.
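A rough sketch of what that decision could look like (UF_NEVER_DROP and UF_ALWAYS_DROP are real driver flags, but the bit values and the surrounding logic below are simplifications, not the drivers' actual drop path):

```c
/* Approach 1 sketch: per-syscall flags override the sampling decision.
 * Flag bit positions here are illustrative. */
#include <stdbool.h>
#include <stdint.h>

#define UF_NEVER_DROP  (1u << 3)
#define UF_ALWAYS_DROP (1u << 4)

static inline bool drop_event(uint32_t syscall_flags,
                              bool sampling_enabled,
                              bool sampled_out)
{
	if (syscall_flags & UF_NEVER_DROP)
		return false;                       /* always keep */
	if (sampling_enabled && (syscall_flags & UF_ALWAYS_DROP))
		return true;                        /* drop whenever sampling is on */
	return sampling_enabled && sampled_out; /* normal sampling decision */
}
```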
WDYT? @gnosek @FedeDP @wangyongfeng5
If I understood well, here you are searching for something slightly different: you don't want to disable syscalls, you just want to sample them, because maybe some accept events (as in your example) could be useful, so you don't want to drop them all.
As far as I understood, @wangyongfeng5 just wants to drop all of them; in that scenario, the syscalls-of-interest feature is enough.
In the scenario @Andreagit97 imagined, I agree with him: approach 1 is the most likely to be implemented :)
Yes, if accept does not return EAGAIN, the event is still very important, so we don't want to discard all accept events.
In addition, we are also trying to find a way to filter useless events, to replace the indiscriminate discarding of sampling. For example, filtering events in kernel mode, i.e. before the event data reaches the ring buffer, according to syscall id, return value, pid, and other conditions.
Yes, if accept does not return EAGAIN, the event is still very important, so we don't want to discard all accept events.
Oh I see, sorry! Thank you for the explanation.
In addition, we are also trying to find a way to filter useless events, to replace the indiscriminate discarding of sampling. For example, filtering events in kernel mode, i.e. before the event data reaches the ring buffer, according to syscall id, return value, pid, and other conditions.
That would be very nice; we have thought several times about filtering events kernel-side using comm or pid. The only blocker we always faced was the possible overhead on non-filtered events: for example, if for every event we have to check against a list of 50 comms, we pay a high price on each event while maybe filtering out just 1/1000 of them. But yes, maybe finding the right combination could be a great improvement. What I don't like about sampling is that you lose potentially important events and you don't have any way to control it :/
I think it's worth it:
- In threat detection scenarios, filtering out events generated by trusted processes should be very useful, because in most production environments, events generated by business processes account for the vast majority, and they are usually harmless.
- Regarding the filtering of junk events (such as the accept that returns EAGAIN), if we can support flexible conditions such as combining pid, syscall, direction, errno, etc., we should be able to filter out most of them (see the sketch after this list).
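For illustration, such a rule could be expressed along these lines; every name and field below is hypothetical, just to make the idea concrete:

```c
/* Hypothetical user-configurable kernel-side filter rule. */
#include <stdbool.h>
#include <stdint.h>

struct filter_rule {
	uint32_t syscall_id; /* which syscall the rule applies to */
	int64_t  retval;     /* drop only when the return value matches */
	int32_t  pid;        /* -1 = any pid */
	bool     exit_only;  /* match only exit (not enter) events */
};

/* Drop accept() exit events that returned -EAGAIN, from any pid. */
static const struct filter_rule drop_eagain_accept = {
	.syscall_id = 43,  /* __NR_accept on x86_64 */
	.retval     = -11, /* -EAGAIN */
	.pid        = -1,
	.exit_only  = true,
};
```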
It feels to me that sampling is the wrong approach here: you do not care about e.g. accept() calls returning -EAGAIN at all, so it would be best to filter them out completely, not (semi-)randomly sample them and hope we catch the non-EAGAIN ones.
IMHO what you want is UF_NEVER_DROP on the accept* syscalls (it's weird that we don't have this already...) combined with improvements to drop_nostate_event for kmod (and its counterparts in other engines, I don't know them off the top of my head) so that we can drop accept() syscalls returning -EAGAIN (or maybe just drop everything that returns -EAGAIN and be done with it?)
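The "just drop everything that returns -EAGAIN" variant would be a very small check; a simplified sketch (the real drop_nostate_event logic in the kmod is structured differently):

```c
/* Sketch: drop any exit event whose return value is -EAGAIN. */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

static inline bool drop_eagain_event(int64_t retval, bool is_exit_event)
{
	/* An exit event carrying -EAGAIN adds no state and is rarely useful. */
	return is_exit_event && retval == -EAGAIN;
}
```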
@Andreagit97 @FedeDP do you think it's realistic to expose UF_ALWAYS_DROP/UF_NEVER_DROP via ppm_sc_of_interest? Again, going from accept/drop to accept/accept-when-not-sampling/drop.
There are several, unfortunately conflicting, viewpoints on syscall/event filtering:
- troubleshooting (e.g. sysdig) wants to see everything
- security (e.g. falco) wants to reliably see a fairly small subset of events
- performance metrics can be fine with subsampling some (but not all) syscalls and extrapolating the metrics from a (known) proportion of sampled data
In threat detection scenarios, filtering out events generated by trusted processes should be very useful, because in most production environments, events generated by business processes account for the vast majority, and they are usually harmless.
... and one 0day later the process connects to somewhere in Lower Elbonia ;)
Regarding the filtering of junk events (such as the accept that returns EAGAIN), if we can support flexible conditions such as combining pid, syscall, direction, errno, etc., we should be able to filter out most of them.
I'm mildly pessimistic about flexible filtering in the kernel, precisely due to the overhead for non-filtered events.
We can benchmark to measure the overhead, but it will always depend on the ratio of filtered to non-filtered events (is the non-filtered overhead greater than the overhead of generating events that will get immediately dropped by userspace?).
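As a back-of-envelope framing (mine, not a measured result): if every event pays an extra checking cost c_check and each filtered event saves the cost c_emit of being emitted and then dropped in userspace, in-kernel filtering only wins when N_total × c_check < N_filtered × c_emit, i.e. when the filtered fraction exceeds c_check / c_emit.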
Long term, I'd love to see the probe generated on demand based on whatever filters we have, but I might not live long enough to see it happen ;)
I created another PR to implement approach 1.
Thank you, we will take a look ASAP; right now we are a little bit busy with the release.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
Yeah, it is not clear to us yet how to fit this change in among the existing logic; we know this is a valuable feature, but we don't want to explode the complexity with yet another mechanism in the drivers... I am moving it to /TBD until we have clearer ideas, sorry for that.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle rotten