tracee icon indicating copy to clipboard operation
tracee copied to clipboard

ExecHash feature don't work for all configs

Open AlonZivony opened this issue 3 years ago • 5 comments

Prerequisites

  • [x] This affects latest released version.
  • [x] This affects current development tree (origin/HEAD).
  • [x] There isn't an issue describing the bug.

Select one OR another:

  • [ ] I'm going to create a PR to solve this (assign to yourself).
  • [x] Someone else should solve this.

Bug description

The ExecHash feature is handled using the ProcessEvent function. However, for the handling to work some events need to be passed to user-mode. In current situation, the use of these features does not change which events are passed to the user-mode, which might result a situation where they doesn't pass. This is also true for the pidsInMntns member of Tracee which might not be updated without certain events.

AlonZivony avatar May 31 '22 15:05 AlonZivony

TL;DR is: we should have the events set by default (because they're dependency of all existing events that need procInfo, for example) but with eventConfig.emit = false (so they're not emitted to userland).

This way, sinkEvents won't emit the events in the pipeline, but, still, ProcessEvent will do needed job.

rafaeldtinoco avatar Jun 19 '22 18:06 rafaeldtinoco

But if we do what you suggest here, it results using more events than we have to. I think we want to reduce the amount of eBPF programs loaded to the kernel and the amount of events passed to user-mode for the goal of reducing performance. I don't think that making them default is a problem, but I think we have interest in avoiding unnecessary workload.

AlonZivony avatar Jun 20 '22 07:06 AlonZivony

I think we should divide these values (I know of procInfo and pidsInMntns) to a sub-struct for uninitialized values, and add at least a documentation of how to make these members functionable. I think that in the future we should also create a simple dependency for those who want to use these members which handles all dependencies required for using them.

AlonZivony avatar Jun 20 '22 07:06 AlonZivony

But if we do what you suggest here, it results using more events than we have to. I think we want to reduce the amount of eBPF programs loaded to the kernel and the amount of events passed to user-mode for the goal of reducing performance. I don't think that making them default is a problem, but I think we have interest in avoiding unnecessary workload.

Ok, so all eBPF programs are loaded, and just the ones related to the picked events are attached (and the dependencies).

We're saying that some events are always needed for internal in-memory data to keep things working (the "service" events), correct ? I'm afraid those are always needed (at least with what we currently have).

We should avoid using this kind of technique IF we can keep everything we need in BPF maps only, but for the container creation/deletion logic, as well as procinfo, we have to have both: userland in-memory data and eBPF maps data, because of race conditions during tracee-ebpf initialization and eBPF programs attachment delays.

rafaeldtinoco avatar Jun 20 '22 11:06 rafaeldtinoco

I've discussed with @AlonZivony offline and with his feedback I've assumed that the bug in question is not a real bug but rather a theoretical one.

It involves the use of pidsInMntns for the ExecHash feature and capture. pidsInMntns is used by both ExecHash and capture functionalities.

There was a concern that adding a hash to another event would create an indirect dependency on pidsInMntns and sched_process_exec, which updates it. However, it's important to note that currently captures do require sched_process_exec, and sched_process_exec had itself, so the issue is not affecting anything. The dependency on sched_process_exec is not enforced by Tracee, but since captures already depend on it, there is no immediate concern regarding this matter.

The urgency of addressing this issue is not as high as initially thought.

geyslan avatar Jul 06 '23 19:07 geyslan