libs icon indicating copy to clipboard operation
libs copied to clipboard

new: introduce `is_exe_upper_layer`

Open loresuso opened this issue 3 years ago • 16 comments

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area build

/area driver-kmod

/area driver-ebpf

/area libscap

/area libsinsp

/area tests

/area proposals

What this PR does / why we need it:

This PR tries to deal with overlayfs, a union mount filesystem often used by container runtimes, to understand if a process is executing something that is part of the base image of a container or not. Monitoring this could be relevant from a security standpoint since best practices state that containers should be immutable. For this reason, in many use cases, we don't expect processes in the container to execute something that is not part of the base image, whether that file was created in the container or an existing one was modified. To do so, the PR aims at identifying the layer of the file being executed by a process: if the file belongs to the lower layer, it means that it was part of the base image of the container. Otherwise, the process executed something that is part of the upper layer, and a flag is set upon the execve event.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

To better understand the code, I'd like to give reviewers some useful pointers. First, we must check if the file being executed is part of an overlayfs. We can do that by retrieving the superblock magic of the file and comparing it with the OVERLAYFS_SUPER_MAGIC. You can see here that this constant is used when an overlayfs is mounted. Then, to check the layer of the executable file, we can do some pointer arithmetics, having in mind that the pointer to the inode that we can retrieve form struct file is pointing to the vfs_inode member of the ovl_inode struct, as you can see here.

The last thing that I wanted to discuss is that for now I didn't find a way to retrieve the upper_layer information when scanning the proc filesystem. One possible way to do it could be by inspecting the /proc/<pid>/mounts file and checking if there is an overlayfs entry, which in case we can use to retrieve the upperdir directory and look for the exe file there. However, this would easily work when Falco is installed on the host, but could be a problem when it is running as a container. This is because most of the time the upperdir directory is placed in /var/lib/docker, /var/lib/containerd directories. This means we would need to give the Falco container access to those directories. I would like to know what do you think, and if you like it, I can implement it. Also, I am open to any other possible solution on this!

EDIT: I have implemented this approach

Does this PR introduce a user-facing change?:

new: introduce is_exe_upper_layer

loresuso avatar Apr 11 '22 10:04 loresuso

Hi @loresuso. Thanks for your PR.

I'm waiting for a falcosecurity member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

poiana avatar Apr 11 '22 10:04 poiana

This is really cool @loresuso ! Thanks for this effort!

FedeDP avatar Apr 11 '22 12:04 FedeDP

/ok-to-test

FedeDP avatar Apr 11 '22 12:04 FedeDP

Hey @FedeDP, thank you! I agree with you, maybe we can better document the fact those new fields (is_exe_writable and is_exe_upper_layer) can be trusted only on specific kernel versions? I thought about that too, didn't know how to solve it honestly

loresuso avatar Apr 11 '22 14:04 loresuso

I agree with you, maybe we can better document the fact those new fields (is_exe_writable and is_exe_upper_layer) can be trusted only on specific kernel versions? I thought about that too, didn't know how to solve it honestly

Yeah that seems a reasonable thing to do.
Another way of solving would be to use a string instead of a boolean and returning <NA> where the info is not available, but i don't quite like the idea :)

FedeDP avatar Apr 12 '22 07:04 FedeDP

I agree with you, maybe we can better document the fact those new fields (is_exe_writable and is_exe_upper_layer) can be trusted only on specific kernel versions? I thought about that too, didn't know how to solve it honestly

Yeah that seems a reasonable thing to do. Another way of solving would be to use a string instead of a boolean and returning where the info is not available, but i don't quite like the idea :)

Yeah, I don't like that too, execve gets called pretty much of course, let's not waste space on the ring buffer just for this :)

loresuso avatar Apr 12 '22 08:04 loresuso

Yeah, I don't like that too, execve gets called pretty much of course, let's not waste space on the ring buffer just for this :)

I meant that we can push onto the ring buffer eg: [ -1, 0, 1] where -1 means "NA", 0 means "false" and 1 means "true". Ring buffer would be untouched. What i don't really like is having the filterchecks on a string instead of a boolean, it is much less elegant.

FedeDP avatar Apr 12 '22 08:04 FedeDP

Ah I got it now, sorry! Yeah you are right

loresuso avatar Apr 12 '22 08:04 loresuso

Hey @FedeDP, I think I have addressed all your comments:

  • Stated that is_exe_upper_layer is meaningful if underlying kernel version is above 3.18
  • Added a small function get_exe_inode to retrieve the inode of the executable. It is now used by exe_writable and exe_upper_layer.
  • decided to add a function to properly set the upper_layer flag when scanning procfs. I tested it with docker and containerd and it seems perfectly ok, let me know what do you think on this
  • Noted that the overlays magic constant was defined in a c file before kernel 4.5, and moved to magic.h only after, so to make it work, I defined PPM_OVERLAYFS_SUPER_MAGIC as you can see

Thank you! Your feedback is always much appreciated!!

loresuso avatar Apr 14 '22 16:04 loresuso

/retest

loresuso avatar Apr 21 '22 14:04 loresuso

Hey maintainers! I have rebased this PR again, and I still think is a good feature to add. What do you think? 🙂

loresuso avatar Jul 22 '22 13:07 loresuso

/cc @FedeDP @Andreagit97 @LucaGuerra

loresuso avatar Jul 22 '22 15:07 loresuso

@loresuso: GitHub didn't allow me to request PR reviews from the following users: LucaGuerra.

Note that only falcosecurity members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @FedeDP @Andreagit97 @LucaGuerra

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

poiana avatar Jul 22 '22 15:07 poiana

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: loresuso Once this PR has been reviewed and has the lgtm label, please ask for approval from fededp by writing /assign @fededp in a comment. For more information see:The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

poiana avatar Jul 26 '22 09:07 poiana

/milestone 0.10.0

leogr avatar Sep 16 '22 15:09 leogr

One more thought/question: You mentioned that this does not work for all runtimes, would it make sense to mention the runtimes that are compatible in the definitions?

incertum avatar Sep 17 '22 01:09 incertum

@loresuso as modern_bpf is on the near-term roadmap for initial experimental release, would it be possible to already add this new signal to modern_bpf as well? Would be much appreciated. CC @Andreagit97

CC @LucaGuerra would also appreciate adding the exe writable signal to modern_bpf (if possible) to have feature parity across all 3 drivers.

incertum avatar Nov 16 '22 23:11 incertum

Hello everybody, I have reworked this PR addressing the comments and implemented the related version in the modern eBPF probe, so that we have feature parity in all three instrumentation modes. I also tested the old eBPF probe and the kmod on older kernels, like version 4.18 and unfortunately we are not able to retrieve the upper layer information. It seems that the kernel uses a sb_magic that is different in newer kernel and associates it to ext3 in my experiments.

If everyone like the approach of this PR, I can work on finding the minimum version from which it starts to be working and add a comment about it inside the code, but first I would like to hear everyone's opinion since this is a really time consuming task.

Also, I am thinking about ways to introduce tests for this feature. A possible way would be to mount an overlayfs and try to execute something from there and verify that the flag gets set.

loresuso avatar Nov 24 '22 16:11 loresuso

This PR LGTM! Thank you for all the effort you put into it! :tada:

If everyone like the approach of this PR, I can work on finding the minimum version from which it starts to be working and add a comment about it inside the code, but first I would like to hear everyone's opinion since this is a really time consuming task.

Yeah, it would be amazing to find the first working version!

Also, I am thinking about ways to introduce tests for this feature. A possible way would be to mount an overlayfs and try to execute something from there and verify that the flag gets set.

Absolutely, I think we need a test, in this way we can easily detect when this access vfs_inode + sizeof(struct inode)); is no more valid

Andreagit97 avatar Nov 24 '22 17:11 Andreagit97

LGTM label has been added.

Git tree hash: 54081517502da7292c059a57b9aff84e49779ba5

poiana avatar Nov 25 '22 14:11 poiana

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP, loresuso

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • ~~OWNERS~~ [Andreagit97,FedeDP]

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

poiana avatar Nov 25 '22 14:11 poiana