cloudbeat
Add namespace_inode to process resource id
Motivation
While working on https://github.com/elastic/security-team/issues/3532, we've decided to handle the `namespace_inode` at a later stage.
Definition of done
Process resource id should include the `namespace_inode`.
Please follow this checklist. At the beginning of your work, comment with a suggested high-level solution. It should include:
- [ ] Comment describing high-level implementation details
- [ ] Include API and data models
- [ ] Include assumptions being taken
- [ ] Provide backward/forward compatibility when changing data model schemas and key constants
- [ ] Mention relevant individuals with a reason (getting feedback, FYI etc)
- [ ] Submit a PR for our technical index that includes breaking changes/new features before closing this ticket.
Clarification:
According to @jeniawhite, this will require changes in osquery (fetching the `namespace_inode`).
See his detailed comment:
Some comments about the `namespace_inode`. Attempted to fill in the gaps on what is needed in order to add this capability.
We are using the osquery proc library for the process information that we are collecting. It is parsing the `stat` file under the `/proc/pid` folder in order to get the information about the process.
I've read a bit more about `namespace_inode` at pid_namespaces and found out about the `/proc/*pid*/ns/` folder that has information about the namespace of the PID (there are symlinks inside). Then I've located this demo code that pulls up the information and builds the namespace_inode.
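
As a rough illustration of the `/proc/*pid*/ns/` lookup described above, here is a minimal Go sketch. It assumes the `pid` symlink target has the usual Linux form `pid:[<inode>]`. The helper names `parseNamespaceInode` and `readPIDNamespaceInode` are hypothetical and are not taken from osquery, cloudbeat, or the demo code mentioned above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNamespaceInode (hypothetical helper) extracts the inode number from a
// namespace symlink target such as "pid:[4026531836]".
func parseNamespaceInode(target string) (uint64, error) {
	start := strings.IndexByte(target, '[')
	end := strings.IndexByte(target, ']')
	if start < 0 || end <= start+1 {
		return 0, fmt.Errorf("unexpected namespace link target %q", target)
	}
	return strconv.ParseUint(target[start+1:end], 10, 64)
}

// readPIDNamespaceInode (hypothetical helper) reads the symlink at
// <procRoot>/<pid>/ns/pid and returns the inode of the PID namespace.
// procRoot is a parameter so that a mounted host /proc can be passed in.
func readPIDNamespaceInode(procRoot string, pid int) (uint64, error) {
	link := filepath.Join(procRoot, strconv.Itoa(pid), "ns", "pid")
	target, err := os.Readlink(link)
	if err != nil {
		return 0, fmt.Errorf("reading %s: %w", link, err)
	}
	return parseNamespaceInode(target)
}

func main() {
	// Look up the PID namespace inode of the current process.
	inode, err := readPIDNamespaceInode("/proc", os.Getpid())
	if err != nil {
		fmt.Println("could not read namespace inode:", err)
		return
	}
	fmt.Println("pid namespace inode:", inode)
}
```

Reading these symlinks for other users' processes typically needs elevated privileges, so a real fetcher would have to decide how to handle a failed lookup.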
I've looked at the process entity_id issue. After browsing the code of `elastic-dev`, I've noticed that they are building the unique PID out of PID number and timestamp here in this source code.
From my understanding, the fetcher is reading the host's `/proc` folder in order to get the process information, which means that the `namespace_inode` will be the same for the processes:
@tehilashn -
Currently, we calculate the resource.id by `UUID(clusterId + nodeId + pid)`.
As the PID is not unique, we might get a collision, so it's an important addition to our resource.id calculation.
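
To make the proposed change concrete, here is a hedged Go sketch of a resource id that also includes the namespace inode. The issue does not say how `UUID(...)` is computed, so this sketch assumes a name-based (version 5, SHA-1) UUID built with the standard library only. `uuidV5`, `processResourceID`, the all-zero UUID namespace, and the `/` separator between fields are assumptions for illustration, not cloudbeat's actual implementation.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strconv"
	"strings"
)

// uuidV5 (hypothetical helper) derives a name-based UUID in the RFC 4122
// version 5 layout: SHA-1 over namespace+name, truncated to 16 bytes,
// with the version and variant bits set.
func uuidV5(namespace [16]byte, name string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// processResourceID (hypothetical helper) combines clusterId, nodeId, pid and
// the namespace inode. A separator is an assumption added here so that, for
// example, pid 1 + inode 23 and pid 12 + inode 3 do not produce the same key.
func processResourceID(clusterID, nodeID string, pid int, nsInode uint64) string {
	key := strings.Join([]string{
		clusterID,
		nodeID,
		strconv.Itoa(pid),
		strconv.FormatUint(nsInode, 10),
	}, "/")
	// The all-zero namespace is a placeholder assumption.
	return uuidV5([16]byte{}, key)
}

func main() {
	// Same PID in two different PID namespaces yields two different ids.
	fmt.Println(processResourceID("cluster-a", "node-1", 42, 4026531836))
	fmt.Println(processResourceID("cluster-a", "node-1", 42, 4026532001))
}
```

Because the id is derived from its inputs, adding the inode changes every existing process resource id, which is relevant to the backward-compatibility item in the checklist.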