cloudbeat

Add namespace_inode to process resource id

Open jeniawhite opened this issue 2 years ago • 2 comments

Motivation

While working on https://github.com/elastic/security-team/issues/3532, we've decided to handle the namespace_inode at a later stage.

Definition of done

Process resource id should include the namespace_inode.

Please follow this checklist. At the beginning of your work, comment with a suggested high-level solution. It should include:

  • [ ] Comment describing high level implementation details
  • [ ] Include API and data models
  • [ ] Include assumptions being taken
  • [ ] Provide backward/forward compatibility when changing data model schemas and key constants
  • [ ] Mention relevant individuals with a reason (getting feedback, FYI etc)
  • [ ] Submit a PR for our technical index that includes breaking changes/new features before closing this ticket.

jeniawhite May 03 '22 10:05

Clarification: According to @jeniawhite this will require changes in osquery (fetching the namespace_inode). See his detailed comment:

Some comments about the namespace_inode. I attempted to fill in the gaps on what is needed to add this capability.

We use the osquery proc library to collect process information. It parses the stat file under the /proc/<pid> folder to get the information about the process.

I've read a bit more about namespace_inode in pid_namespaces and found the /proc/<pid>/ns/ folder, which holds information about the namespaces of the PID (as symlinks).

Then I located this demo code, which reads that information and builds the namespace_inode.
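As a rough illustration of the idea (not the linked demo code), here is a minimal Go sketch that reads one of the symlinks under /proc/<pid>/ns/ and extracts the inode number. On Linux the link target has the form `pid:[4026531836]`. The function names `parseNamespaceInode` and `readNamespaceInode`, and the `procRoot` parameter, are hypothetical and only used for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNamespaceInode extracts the inode number from a namespace symlink
// target such as "pid:[4026531836]".
func parseNamespaceInode(target string) (uint64, error) {
	open := strings.IndexByte(target, '[')
	if open < 0 || !strings.HasSuffix(target, "]") {
		return 0, fmt.Errorf("unexpected namespace link format: %q", target)
	}
	return strconv.ParseUint(target[open+1:len(target)-1], 10, 64)
}

// readNamespaceInode resolves <procRoot>/<pid>/ns/<nsType> and returns the
// namespace inode. procRoot is a parameter (hypothetical) so that a mounted
// host /proc could be passed instead of the container's own /proc.
func readNamespaceInode(procRoot string, pid int, nsType string) (uint64, error) {
	target, err := os.Readlink(filepath.Join(procRoot, strconv.Itoa(pid), "ns", nsType))
	if err != nil {
		return 0, err
	}
	return parseNamespaceInode(target)
}

func main() {
	inode, err := readNamespaceInode("/proc", os.Getpid(), "pid")
	if err != nil {
		fmt.Println("could not read namespace inode:", err)
		return
	}
	fmt.Println("pid namespace inode:", inode)
}
```

Reading the link may require sufficient privileges for processes owned by other users, so a real fetcher would need to decide how to handle a failed read.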

I've also looked at the process entity_id issue. After browsing the elastic-dev code, I noticed that they build the unique PID from the PID number and a timestamp in this source code.
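The PID-plus-timestamp approach can be sketched as follows. This is not the elastic-dev code; it assumes the timestamp is the process start time, which is field 22 (`starttime`, in clock ticks since boot) of /proc/<pid>/stat. The names `parseStartTime` and `processKey` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStartTime returns field 22 (starttime) from the contents of
// /proc/<pid>/stat. The command name (field 2) is wrapped in parentheses and
// may itself contain spaces or parentheses, so parsing resumes after the
// last ')'.
func parseStartTime(statLine string) (uint64, error) {
	end := strings.LastIndexByte(statLine, ')')
	if end < 0 {
		return 0, fmt.Errorf("malformed stat line: no closing parenthesis")
	}
	fields := strings.Fields(statLine[end+1:])
	// fields[0] is field 3 (state), so field 22 sits at index 19.
	const startTimeIdx = 19
	if len(fields) <= startTimeIdx {
		return 0, fmt.Errorf("stat line has too few fields: %d", len(fields))
	}
	return strconv.ParseUint(fields[startTimeIdx], 10, 64)
}

// processKey combines the PID and start time into one key (assumed format),
// so that a recycled PID yields a different key.
func processKey(pid int, startTime uint64) string {
	return fmt.Sprintf("%d-%d", pid, startTime)
}

func main() {
	sample := "1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 5 3 0 0 20 0 1 0 987654 1000000"
	start, err := parseStartTime(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(processKey(1234, start))
}
```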

From my understanding, the fetcher reads the host's /proc folder to get the process information, which means that the namespace_inode will be the same for the processes:

[Screenshot: Screen Shot 2022-05-03 at 10 15 08]

eyalkraft May 25 '22 14:05

@tehilashn - Currently, we calculate the resource.id as UUID(clusterId + nodeId + pid). Since the PID is not unique, we might get a collision, so it's an important addition to our resource.id calculation.
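To make the collision concrete, here is a minimal Go sketch of an ID that also mixes in the namespace inode. It is an illustration under assumptions, not cloudbeat's implementation: the UUID function cloudbeat actually uses is not shown in this thread, so the sketch derives a UUID-shaped value from a SHA-1 digest using only the standard library. The "/" separator and the names `nameBasedUUID` and `processResourceID` are also hypothetical.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strconv"
)

// nameBasedUUID formats the first 16 bytes of a SHA-1 digest as a UUID-shaped
// string, with version-5 and RFC 4122 variant bits set. Assumption: this is a
// stand-in for whatever UUID function the real code uses.
func nameBasedUUID(name string) string {
	sum := sha1.Sum([]byte(name))
	b := sum[:16]
	b[6] = (b[6] & 0x0f) | 0x50 // version 5
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// processResourceID builds the ID from cluster, node, namespace inode and PID.
// The separator keeps inputs such as ("1", "23") and ("12", "3") distinct.
func processResourceID(clusterID, nodeID string, pid int, nsInode uint64) string {
	name := clusterID + "/" + nodeID + "/" +
		strconv.FormatUint(nsInode, 10) + "/" + strconv.Itoa(pid)
	return nameBasedUUID(name)
}

func main() {
	// Same cluster, node and PID, but different PID namespaces.
	a := processResourceID("cluster-1", "node-1", 42, 4026531836)
	b := processResourceID("cluster-1", "node-1", 42, 4026532201)
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println("collision:", a == b)
}
```

With only clusterId, nodeId and pid as inputs, the two processes above would map to the same resource.id; adding the namespace inode separates them.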

uri-weisman Jul 04 '22 14:07