libs icon indicating copy to clipboard operation
libs copied to clipboard

[WIP] Collect correct arguments when `setproctitle` is called

Open Andreagit97 opened this issue 2 years ago • 20 comments

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

Some days ago we faced this strange thing in the PR regarding e2e tests (https://github.com/falcosecurity/libs/pull/967#discussion_r1133483599). The question was, why the modern probe is taking something different respect than the other 2 drivers? the answer is pretty funny.

nginx under the hood calls the setproctitle method, what this method does is quite strange it moves the actual args of the process from mm->arg_start to mm->arg_env, so into the environment variables space, and overwrite the content of mm->arg_start with the "process title". In our case the modern probe prints:

nginx: master process nginx -g daemon off;

this is because it tries to read a string until the first \0 is faced, in this case since args memory and env memory are contiguous we are reading both the process title (nginx: master process) and the exe+args (nginx -g daemon off). The other 2 drivers try to read only the args memory and for this reason, we read just the process title nginx: master process instead of real exe and args.

This patch tries to solve this issue in all three drivers but we have some concerns:

  • in the old probe this part of the code is probably the most fragile, so I had to rewrite it :( it is still too complex for some kernels like 4.14 but I can simplify it a little bit! Btw this is always the same topic we have here https://github.com/falcosecurity/libs/issues/940, changes like these that touch fillers like proc_startupdate are very dangerous from the stability point of view. This is why we were evaluating a partial refactor...
  • The actual patch takes inspiration from there https://elixir.bootlin.com/linux/latest/source/mm/util.c#L965 (this is a kernel helper that tries to address the issue of setproctitle function). What this function does is check if there is a null terminator at the end of the args memory, if no, it considers this area modified by setproctitle and checks for the real args into the env memory. BTW it could happen that for some reason args memory misses the final terminator, this would mean that we read the env memory even if setproctitle was not called and so we read junk...take a look at the forkX_father_setproctitle test in drivers_tests

Before proceeding in fixing the verifier losing some days (:laughing:) we would be sure that this is what we want :) WDYT @LucaGuerra @Molter73 @incertum ?

Which issue(s) this PR fixes:

Fixes #988

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Andreagit97 avatar Mar 14 '23 15:03 Andreagit97

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

poiana avatar Mar 14 '23 15:03 poiana

Wow extraordinary research work :heart: , need to look more into it, hence only posting some initial thoughts:

  • How common is setproctitle usage? Any ideas?
  • Yes, we should invest into the improvement because exe, exepath, proc name and args and env and the like are all super important fields for many Falco rules
  • Initial concern is around what you mentioned that we could "read the env memory even if setproctitle was not called" ... curious if we could find some smarter checks

incertum avatar Mar 15 '23 07:03 incertum

How common is setproctitle usage? Any ideas?

Not sure about that but as we saw from e2e tests at least nginx use it :/ in that case, we were collecting the proc title nginx: master process

Yes, we should invest into the improvement because exe, exepath, proc name and args and env and the like are all super important fields for many Falco rules

agree with that :+1:

Initial concern is around what you mentioned that we could "read the env memory even if setproctitle was not called" ... curious if we could find some smarter checks

we will try to find something else :mag:

Andreagit97 avatar Mar 15 '23 08:03 Andreagit97

Maybe in Dec or early next year we could revive this discussion?

incertum avatar Nov 13 '23 04:11 incertum

Just checking in here: How do we all feel wrt prioritization? Since this PR has been created a while ago.

incertum avatar Feb 06 '24 17:02 incertum

No super priority at the moment IMO

Andreagit97 avatar Feb 07 '24 09:02 Andreagit97

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar May 07 '24 09:05 poiana

/remove-lifecycle stale

Andreagit97 avatar May 07 '24 13:05 Andreagit97

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Aug 05 '24 16:08 poiana

/remove-lifecycle stale

Andreagit97 avatar Aug 06 '24 07:08 Andreagit97

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Nov 04 '24 10:11 poiana

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana avatar Dec 04 '24 10:12 poiana

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

poiana avatar Jan 03 '25 10:01 poiana

@poiana: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

poiana avatar Jan 03 '25 10:01 poiana

/reopen /remove-lifecycle rotten

FedeDP avatar Jan 03 '25 10:01 FedeDP

@FedeDP: Reopened this PR.

In response to this:

/reopen /remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

poiana avatar Jan 03 '25 10:01 poiana

Please double check driver/SCHEMA_VERSION file. See versioning.

/hold

github-actions[bot] avatar Jan 03 '25 10:01 github-actions[bot]

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Apr 03 '25 16:04 poiana

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana avatar May 03 '25 16:05 poiana

/remove-lifecycle rotten

FedeDP avatar May 05 '25 06:05 FedeDP

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

poiana avatar Aug 03 '25 10:08 poiana

Stale issues rot after 30d of inactivity.

Mark the issue as fresh with /remove-lifecycle rotten.

Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle rotten

poiana avatar Sep 02 '25 10:09 poiana

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

poiana avatar Oct 02 '25 10:10 poiana

@poiana: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue with /reopen.

Mark the issue as fresh with /remove-lifecycle rotten.

Provide feedback via https://github.com/falcosecurity/community. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

poiana avatar Oct 02 '25 10:10 poiana