libs
libs copied to clipboard
new(libsinsp): add concatenated process lineage filter fields + sinsp_filter_check_thread helpers cleanup
What type of PR is this?
Uncomment one (or more)
/kind <>lines:
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
Any specific area of the project related to this PR?
Uncomment one (or more)
/area <>lines:
/area API-version
/area build
/area CI
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap-engine-bpf
/area libscap-engine-gvisor
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap-engine-nodriver
/area libscap-engine-noop
/area libscap-engine-source-plugin
/area libscap-engine-savefile
/area libscap
/area libpman
/area libsinsp
/area tests
/area proposals
Does this PR require a change in the driver versions?
/version driver-API-version-major
/version driver-API-version-minor
/version driver-API-version-patch
/version driver-SCHEMA-version-major
/version driver-SCHEMA-version-minor
/version driver-SCHEMA-version-patch
What this PR does / why we need it:
- Less loops and more efficient if you want to export all process ancestors up to a certain level anyways
- More convenient and intuitive display of information -> security analysts prefer an output like this /bin/java->/bin/bash->/bin/python->/bin/bash to quickly understand the process origins
- Enhanced threat detection capabilities -> can string match exact lineage / sequence
- Needed in the planned anomaly detection framework
Note: I don't think we need this for the other proc.a* fields aka only for the process names, exe, and exepaths ...
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
new(libsinsp): add concatenated process lineage fields as filter / display fields
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: incertum
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [incertum]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
CC @loresuso @darryk10
Yeah I had similar thoughts that we are getting to a point where there is lots of code duplication in this regard (not limited to the type of fields touched in this PR).
I'll check what makes sense to clean up in this PR!
@Andreagit97 added 3 cleanup commits on top, performing significant consolidation w/ new helpers, beyond the initial request. WDYT and do you have additional ideas?
@incertum
Can you rebase pls? Then the sanitizer CI jobs should run.
It seems that while simply rebasing worked, there are updates needed because a few things changed I suspect ... will look into it soon.
@incertum we still have CI issues. I'll ping you once they are fixed.
While I was reviewing this, I opted for a slightly different methods organization, let me know if it works for you, otherwise feel free to revert it! In the meanwhile I've addressed my own review comments and the new changes should solve the visibility issues you faced after the rebase :)
LGTM, thanks!
Jumping in late because I lost sight of this, but I'm just wondering why a more general-purpose field with EPF_IS_LIST (e.g. fd.types) flag was not considered. It would allow the more performant in, intersectes, and pmatch operations, while still being printable as a comma-separated list (you'd lose the ->, that's true), and wouldn't add yet another edge case to the way we execute field comparisons. For clarity, I have nothing against these changes, I'm just wondering if we could achieve the same thing by using something we already support natively in our model, and if not what are the pros of introducing another "special" field.
yep I agree with Jason, reviewing this it was not clear to me what is the use case of this -> concatenation...it's cool to see but I don't understand the difference with a simple list :thinking:
Jumping in late because I lost sight of this, but I'm just wondering why a more general-purpose field with
EPF_IS_LIST(e.g.fd.types) flag was not considered. It would allow the more performantin,intersectes, andpmatchoperations, while still being printable as a comma-separated list (you'd lose the->, that's true), and wouldn't add yet another edge case to the way we execute field comparisons. For clarity, I have nothing against these changes, I'm just wondering if we could achieve the same thing by using something we already support natively in our model, and if not what are the pros of introducing another "special" field.
I totally agree with reusing EPF_IS_LIST, instead of reinventing the wheel and adding more edge cases.
Moreover, using EPF_IS_LIST would satisfy all points here:
- Less loops and more efficient if you want to export all process ancestors up to a certain level anyways
- More convenient and intuitive display of information -> security analysts prefer an output like this /bin/java->/bin/bash->/bin/python->/bin/bash to quickly understand the process origins
- Enhanced threat detection capabilities -> can string match exact lineage / sequence
- Needed in the planned anomaly detection framework
With the only difference of , instead of ->. Anyway, comma-separated convention is more consistent across all functionality offered by libs, so even in this case I don't see a compelling reason to change this.
Can we support both? The original motivation was that security analysts who triage these logs want the process lineage as human readable string specifically with the -> delimiter. It would complement and be consistent with the fd.name for ip:port tuples display. That's the output field display part.
Re the filtering yes I totally agree as well and I wasn't aware of EPF_IS_LIST.
Can we support both?
I would avoid doing this in the specific field implementation. In the long term, supporting multiple solutions for the same thing will create more trouble than benefits. We already have several similar situations in legacy fields, and honestly, it's a pain.
The original motivation was that security analysts who triage these logs want the process lineage as human readable string specifically with the
->delimiter. It would complement and be consistent with thefd.namefor ip:port tuples display. That's the output field display part.
If the motivation that drives this change is primarily aesthetics, I would rather (a) use EPF_IS_LIST (in this PR) and (b) introduce a generic formatting option that allows the list separator to be specified (in another PR)
If the motivation that drives this change is primarily aesthetics, I would rather (a) use EPF_IS_LIST (in this PR) and (b) introduce a generic formatting option that allows the list separator to be specified (in another PR)
ok let's do this then, any preferences in field naming given these changes?
ok let's do this then, any preferences in field naming given these changes?
Not sure about this. Perhaps, maybe an implicit convention is to use plural nouns (eg. proc.aexes) for lists. But I'm not sure if we like this.
cc @falcosecurity/libs-maintainers
ok let's do this then, any preferences in field naming given these changes?
Not sure about this. Perhaps, maybe an implicit convention is to use plural nouns (eg.
proc.aexes) for lists. But I'm not sure if we like this. cc @falcosecurity/libs-maintainers
or proc.aexe.lineage which doesn't disclose the data type as concat is usually for strings only.
Hey, actually the list type would truly defy the original purpose. We do not need the operators "in", "exists" or "intersects" -- at least not for the use case I am intending to introduce here. I would like to exactly match on a string sub-sequence, for example zsh->bash aka match on all logs that have that sub lineage in the process tree lineage of that nature -- with a list type I cannot easily do that during event filtering (EBPF_IS_LIST isn't meant for sublist matching and for fd.types for example we don't even care about the insertion order). Also isn't the proc.aname in (zsh, bash) already doing what we need in cases where we just want to check if zsh and bash are anywhere in the lineage? Again I would like to match exactly on zsh->bash, which we cannot do with proc.aname today; it's a different use case.
Furthermore, I still don't get why -> is not ok for display, especially I wouldn't even want this to be associated with a list concept. It should be treated like fd.name for ips, e.g. we get a string like this 8.8.8.8:22->127.0.01:2367 for fd.name. proc.env has also it's own string representation format, so do the k8s pod labels have one, meaning I truly don't see why it should be an issue.
Question: Have you worked on incidents and used these type of fields? My feedback and recommendations stem from real-life and analysts feedback and is not my personal bias. The arrow gives clear indication about the direction of the lineage - analysts specifically need this information to resolve incidents. @jasondellaluce could you elaborate so that I understand your perspective better?
EDIT: Asking for more feedback on slack https://kubernetes.slack.com/archives/CMWH3EH32/p1709938232317309
I would like to exactly match on a string sub-sequence, for example
zsh->bashaka match on all logs that have that sub lineage in the process tree lineage of that nature -- with a list type I cannot easily do that during event filtering (EBPF_IS_LISTisn't meant for sublist matching and forfd.typesfor example we don't even care about the insertion order). Also isn't theproc.aname in (zsh, bash)already doing what we need in cases where we just want to check if zsh and bash are anywhere in the lineage? Again I would like to match exactly onzsh->bash, which we cannot do withproc.anametoday; it's a different use case.
The use case is legit, but the proposed solution is sub-optimal. Performing a full-text search of zsh->bash into a very long string with all the lineage will create performance penalties. I would rather go for a new operator for lists to satisfy the specific requirement of this use case.
Furthermore, I still don't get why
->is not ok for display, especially I wouldn't even want this to be associated with a list concept. It should be treated likefd.namefor ips, e.g. we get a string like this8.8.8.8:22->127.0.01:2367forfd.name.proc.envhas also it's own string representation format, so do the k8s pod labels have one, meaning I truly don't see why it should be an issue.
The separtor is a display/formatting issue indeed, not a way to represent the data. Thus, I'm not against -> perse, I'm just against implementing it at the field level for lists (or anything similar to a list)., Moreover, 8.8.8.8:22->127.0.01:2367 is not a list, and using the same separator by default would not help with consistency. Just think of a list of ``8.8.8.8:22->127.0.01:2367`... how would we separate them?
Considering what we have today for the proc name (it is the same for exepath, exe, ...), I would try to understand what would be the end goal here...
Today we have:
proc.name
proc.pname
proc.sname
proc.vpgid.name
proc.aname
proc.concat_aname // in this PR
A possible solution could be
proc.name
proc.aname[]
proc.lineage[]
Where:
proc.nameis the same as todayproc.aname[]which always takes an argument. It could be an index (0,1,2,...) or a specific key (SESSION, PARENT, GROUP). We coverproc.pname,proc.sname,proc.vpgid.name,proc.aname.proc.lineage[]which is a list and could take an argument. if used without arguments it returns a list with the whole process lineage, if with an argument (an index 0,1,2,...) returns the lineage starting from the index. We coverproc.aname,proc.concat_aname.
Moreover, we need to implement other operators on the list, like =, startwith, ... to cover all the use cases we want with the aforementioned filter checks.
In this way, we can uniform our filter checks and solve the issues we have highlighted in this PR with ->. WDYT?
Moreover, we need to implement other operators on the list, like
=, startwith, ...to cover all the use cases we want with the aforementioned filter checks.
:+1:
re https://github.com/falcosecurity/libs/pull/1625#issuecomment-1988173684
proc.aname[] ... proc.aname[] which always takes an argument
This would break the existing use of proc.aname in the filter expression where we traverse all levels up.
proc.name
proc.pname
proc.sname
proc.vpgid.name
proc.aname
Frankly I think the current variants are actually pretty clear, I don't see direct benefits of changing the naming. If you don't mind me asking, what are the benefits? It could also backfire and be less user friendly? WDYT? From my perspective if anything we should create a super generic filtercheck class that let's folks access any thread property at any level etc. Something definitely beyond the more immediate improvement suggested in this PR, but likely worth it and it would be a bit in line with what @Andreagit97 suggested here.
solve the issues we have highlighted in this PR with
->
Forgive me, I still don't understand the issues with -> to display a lineage / linkage? It appears to be the best choice as a process lineage encodes the order and direction and is not just a list. Should we rather talk of EBPF_IS_LINKED_LIST or something that is closer to our internal concept of IP TUPLES -- anything that directly implies the direction. Now if we want an unordered list as well, that's ok, but we definitely need the new EBPF_IS_LINKED_LIST concept. Graphics or tutorials around linked lists always use -> to explain the links.
@leogr
The use case is legit, but the proposed solution is sub-optimal. Performing a full-text search of zsh->bash into a very long string with all the lineage will create performance penalties. I would rather go for a new operator for lists to satisfy the specific requirement of this use case.
I believe it comes down to the algorithm. Many rules search in proc.cmdline, perhaps we could up our game and make string searching way more performant across the board? What's our current Big O and algo?
Strings are extremely powerful, for example we could now also search for ->zsh->bash implying there was a parent to that sub process tree lineage or zsh->bash-> implying that there was a child ... sublists search couldn't support that, BUT ...
@Andreagit97
Moreover, we need to implement other operators on the list, like =, startwith, ... to cover all the use cases we want with the aforementioned filter checks.
This also opens up new searches as we could now search for sublists where the executable path, for example, ended with a string ... something the string representation cannot truly handle the way we would want to.
In summary:
Understanding that the filterchecks are the core of effectively using Falco, I rather would like to expose too many variants and empower end users. Comparatively they have been much easier to maintain. Historically we have had many many more issues in our parsers.
Let's support:
- string representation of the lineage
- new linked list concept
- and why not an unordered list concept as well (internally) for more performant intersection searches (it could make
proc.aname in (x, y, z)even more performant)?
proc.aname[] ... proc.aname[] which always takes an argument
This would break the existing use of
proc.anamein the filter expression where we traverse all levels up.
According to its description in sinsp
When used without any arguments, proc.aname is applicable only in filters and matches any of the process ancestors. For instance, you can use `proc.aname=bash` to match any process ancestor whose name is `bash`
This behavior should be covered by the new proc.lineage, which will provide a list with the whole lineage and so you could do something like (proc.lineage) intersects bash
proc.name proc.pname proc.sname proc.vpgid.name proc.anameFrankly I think the current variants are actually pretty clear, I don't see direct benefits of changing the naming. If you don't mind me asking, what are the benefits? It could also backfire and be less user friendly? WDYT? From my perspective if anything we should create a super generic filtercheck class that let's folks access any thread property at any level etc. Something definitely beyond the more immediate improvement suggested in this PR, but likely worth it and it would be a bit in line with what @Andreagit97 suggested here.
I just think that these filter checks are doing almost the same thing in the end, they are accessing a single ancestor in the lineage. Providing users with just one method for doing it, is more intuitive IMO. Of course, this is just my idea let's see what other thinks. Moreover having just one unique method will simplify our lives and will help us keep the code "bug-free". BTW yes this would be a long-term plan,
solve the issues we have highlighted in this PR with
->Forgive me, I still don't understand the issues with
->to display a lineage / linkage? It appears to be the best choice as a process lineage encodes the order and direction and is not just a list. Should we rather talk ofEBPF_IS_LINKED_LISTor something that is closer to our internal concept of IP TUPLES -- anything that directly implies the direction. Now if we want an unordered list as well, that's ok, but we definitely need the newEBPF_IS_LINKED_LISTconcept. Graphics or tutorials around linked lists always use->to explain the links.
Uhm this is not an issue, I used the wrong word sorry, it's just a matter of what Jason well explained here https://github.com/falcosecurity/libs/pull/1625#issuecomment-1985315625. Why do we need to support a new string representation of the lineage with -> when we can do a very similar thing with what we have today?
Let's consider again the new proposed filter check proc.lineage[] + the startwith operation between lists. Writing something like proc.lineage startwith (tail, bash) should do exactly what we are talking about... please note that proc.lineage will contain all the ancestors already in order, so for example proc.lineage -> tail, bash, containerd-shim, systemd. The only thing that changes here is the representation so (tail, bash) instead of (tail->bash), or maybe i'm missing something...
Moreover, as a side note, these days we are working on the possibility of comparing 2 filter checks so proc.name = proc.exe and on the possibility of adding new modifiers like toupper(evt.arg.name) = toupper(fd.path) so having less corner cases to handle would surely simplify things. Just to provide you with an example, at the moment we cannot implement the 2 aforementioned features on all the fields like proc.aexe or proc.aexepath because they are using a custom comparing logic, and so we need to uniform them with others before supporting these new features. So I'm not saying we shouldn't add new approaches or new custom logic, I'm just saying that maybe we could obtain all we need from what we have already in place or that we will try to add in the next weeks.
Hey folks,
It looks like we have a lot of ideas here, but we haven't reached a consensus, so I will try to summarize the discussion, collect feedback, and come back with a design proposal.
/assign
Hey folks,
It looks like we have a lot of ideas here, but we haven't reached a consensus, so I will try to summarize the discussion, collect feedback, and come back with a design proposal.
/assign
SGTM Leo.
Also depending on the proposal we may end up using this PR just for some sinsp_filter_check_thread helpers cleanup and create a new one for the new fields? We will see, whichever makes the most sense.
Proposal [1/2]: introducing join(<list>, <sep>) for concatenating list with a custom separator
This first part of my proposal aims to address the displaying issue raised in this discussion. Later, I will post the [2/2] part of the proposal, which focuses on the filtering issue.
Requirements: This proposal depends on https://github.com/falcosecurity/libs/issues/1789, which needs to be implemented first.
Use case: string representation of the processes lineage (as reported by @incertum)
Solution:
- introducing
proc.lineage, which is a list (ie.EPF_IS_LIST) with the whole process lineage (similar to the @Andreagit97 proposal, we can discuss if this field should support arguments separately) - introducing the
join(<list>, <sep>)transformer, as proposed by me in this comment
As a result, we can use join(proc.lineage, "->") in the output: field of a rule.
Note: This solution might also address the filtering part but with some performance penalty. For instance, one could use join(proc.lineage, "->") contains ->zsh->bash to match a sequence. At the functional level, this is equivalent to what is already implemented by this PR. However, I'd discourage users from doing so because performing a full-text search in a long string usually comes with severe performance penalties (we have seen this in the past). It might still a legit workaround until we have a specific implementation to deal with sequences (I will post my solution in the 2nd part of this proposal).
Action items:
- If we reach a consensus on this part of the proposal, I'd recommend rescoping this PR to only add the required field(s) (e.g.,
proc.lineage).
:pray:
If we reach a consensus on this part of the proposal, I'd recommend rescoping this PR to only add the required field(s) (e.g., proc.lineage).
SGTM, only suggestion would be to perhaps follow existing more specific naming conventions, for instance:
- Option A:
proc.exe.lineage,proc.name.lineageor variants - Option B:
proc.lineage[name],proc.lineage[exe], personally don't like that as it would be something entirely new, but if we intend to expand this new concept then I would be ok with it. - ...
If we reach a consensus on this part of the proposal, I'd recommend rescoping this PR to only add the required field(s) (e.g., proc.lineage).
SGTM, only suggestion would be to perhaps follow existing more specific naming conventions, for instance:
- Option A:
proc.exe.lineage,proc.name.lineageor variants- Option B:
proc.lineage[name],proc.lineage[exe], personally don't like that as it would be something entirely new, but if we intend to expand this new concept then I would be ok with it.- ...
Option A is acceptable for me as an extension of my proposal.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale