Container enrichment for plain Docker on Tetragon

Open rnosal opened this issue 3 months ago • 5 comments

Is there an existing issue for this?

  • [x] I have searched the existing issues

Is your feature request related to a problem?

This is a feature request to extend Tetragon so that events for processes running in standard Docker containers on non-Kubernetes systems include container metadata (name, ID).

Currently this metadata is not populated, which makes it difficult to correlate events with specific containers. A similar issue was reported and addressed in https://github.com/cilium/tetragon/issues/2639, but the solution was not tested in a plain Docker environment, which is what I am using.

The feature would be highly beneficial for users who rely on Tetragon for security and observability in standard Docker deployments, providing more context and a clearer view of processes within their environments.

Describe the feature you would like

The feature I would like is for Tetragon to correctly identify and include container metadata (name, ID) for events from processes running in standard Docker containers on a non-Kubernetes system. This would involve adapting the existing metadata collection mechanisms to work seamlessly with native Docker environments, ensuring that event data is enriched with the necessary container context.

As suggested by @inliquid, we could implement a field similar to the existing container field, but moved out of the pod struct to a higher level.

Describe your proposed solution

No response

Code of Conduct

  • [ ] I agree to follow this project's Code of Conduct

rnosal avatar Aug 28 '25 08:08 rnosal

👋 I think the container ID is already available under the docker field; please find the details below.

Retrieving the Docker container name requires querying the Docker API (or something equivalent) based on the container ID. This might be tricky and is arguably out of scope.

$ docker ps
CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS          PORTS                                     NAMES
...
8691e2e6cf4b   quay.io/cilium/json-mock:v1.3.8   "bash /run.sh"           10 minutes ago   Up 10 minutes                                             starwars-xwing-1
...

{
  "process_exit": {
    "process": {
      "exec_id": "OTc2OWZjZjhkMjJjOjE2MTkzMjM4MTgyMjQ2NDo3NzUwMDA=",
      "pid": 775000,
      "uid": 0,
      "cwd": "/",
      "binary": "/usr/bin/curl",
      "arguments": "https://ebpf.io/applications/#tetragon",
      "flags": "execve rootcwd clone",
      "start_time": "2025-08-29T06:22:54.186707998Z",
      "auid": 4294967295,
      "docker": "8691e2e6cf4bedcd801f276ea75515a", # This is the same as the container ID above
      "parent_exec_id": "OTc2OWZjZjhkMjJjOjE2MTkyMDI1NzgwOTY1Mzo3NzQ5NjI=",
      "tid": 775000,
      "in_init_tree": false
    },
    "parent": {
      "exec_id": "OTc2OWZjZjhkMjJjOjE2MTkyMDI1NzgwOTY1Mzo3NzQ5NjI=",
      "pid": 774962,
      "uid": 0,
      "cwd": "/",
      "binary": "/usr/bin/bash",
      "flags": "execve rootcwd clone",
      "start_time": "2025-08-29T06:22:42.062695853Z",
      "auid": 4294967295,
      "docker": "8691e2e6cf4bedcd801f276ea75515a",
      "parent_exec_id": "OTc2OWZjZjhkMjJjOjE2MTY0MTA2NTI5NTU4Njo3NzI4MjQ=",
      "tid": 774962,
      "in_init_tree": false
    },
    "time": "2025-08-29T06:22:55.425856154Z"
  },
  "node_name": "9769fcf8d22c",
  "time": "2025-08-29T06:22:55.425855321Z"
}
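
The lookup described above can be sketched as follows. This is a minimal illustration, not part of Tetragon: it assumes the ID-to-name mapping comes from `docker ps --no-trunc --format '{{.ID}} {{.Names}}'`, and the helper names (`parse_docker_ps`, `resolve_name`) and the zero-padded full ID in the usage example are hypothetical.

```python
from typing import Dict, Optional


def parse_docker_ps(output: str) -> Dict[str, str]:
    """Parse lines of '<full-id> <name>' into a {full_id: name} mapping.

    Assumes the text was produced by:
        docker ps --no-trunc --format '{{.ID}} {{.Names}}'
    """
    names: Dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            container_id, name = parts
            names[container_id] = name.strip()
    return names


def resolve_name(docker_field: str, names: Dict[str, str]) -> Optional[str]:
    """Return the name of the single container whose ID starts with docker_field.

    The `docker` field in a Tetragon event is a truncated ID, so this is a
    prefix match. Returns None when nothing matches or the prefix is ambiguous.
    """
    if not docker_field:
        return None
    matches = [name for cid, name in names.items() if cid.startswith(docker_field)]
    return matches[0] if len(matches) == 1 else None


# Hypothetical full ID: the prefix from the event above, zero-padded to 64 chars.
sample_output = "8691e2e6cf4bedcd801f276ea75515a" + "0" * 33 + " starwars-xwing-1\n"
names = parse_docker_ps(sample_output)
print(resolve_name("8691e2e6cf4bedcd801f276ea75515a", names))
```

The mapping has to be refreshed whenever containers start or stop, which is the part that makes this approach awkward for short-lived containers.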

sayboras avatar Aug 29 '25 06:08 sayboras

Hi @sayboras, thank you for the quick and clear explanation.

I understand that the docker field provides the container ID and that getting the name is a separate step. I know it's possible to build automation to perform a docker inspect for every event, but this would require us to correlate data from a secondary source, which we want to avoid.
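
For reference, the per-event lookup mentioned here might look like the sketch below. It assumes the `docker` CLI is on the PATH and relies on `docker inspect` accepting a unique ID prefix, so the truncated `docker` field can be passed as-is; the function names are illustrative only.

```python
import subprocess
from typing import List, Optional


def inspect_command(container_id: str) -> List[str]:
    """Build the argv for reading a container's name from the Docker CLI."""
    return ["docker", "inspect", "--format", "{{.Name}}", container_id]


def clean_name(raw: str) -> str:
    """docker inspect prints names with a leading slash, e.g. '/starwars-xwing-1'."""
    return raw.strip().lstrip("/")


def lookup_name(container_id: str) -> Optional[str]:
    """Run docker inspect; return None if the lookup fails for any reason
    (container already removed, Docker unavailable, timeout)."""
    try:
        result = subprocess.run(
            inspect_command(container_id),
            capture_output=True, text=True, check=True, timeout=5,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return clean_name(result.stdout)
```

For ephemeral CI containers, `lookup_name` can return `None` simply because the container was removed before the event was processed.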

The primary reason for this feature request is for our security and observability workflow, especially within our GitLab CI/CD environment.

In our setup, GitLab runners create Docker containers with highly contextual names, such as gitlab-runner-project-123-pipeline-456-job-789. This friendly name is crucial because it contains the pipeline and job ID. When we see a suspicious network connection or process from one of these ephemeral containers, the container's name is what allows us to immediately identify the exact project, pipeline, and even the user who triggered the job. The raw container ID is an opaque string that requires a secondary lookup, slowing down our response time (we are running ~50k docker containers daily).

We want Tetragon to be our single source of truth for these critical security events. Relying on a secondary, real-time lookup against the Docker API for every single event would be a major bottleneck and add a point of failure to our log processing pipeline.
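
To make the cost concrete: a consumer that has to resolve names externally would typically cache results per container ID, so the Docker API is queried once per container rather than once per event. The sketch below shows that pattern with a pluggable resolver. `NameCache` and its resolver argument are hypothetical, and caching only reduces the lookup volume; it does not remove the dependency on a second data source.

```python
from typing import Callable, Dict, Optional


class NameCache:
    """Memoize container-name lookups keyed by the event's `docker` field."""

    def __init__(self, resolver: Callable[[str], Optional[str]]) -> None:
        self._resolver = resolver          # e.g. a function that queries Docker
        self._names: Dict[str, Optional[str]] = {}
        self.lookups = 0                   # number of calls made to the resolver

    def name_for(self, docker_field: str) -> Optional[str]:
        # Failed lookups (None) are cached too, so a container that is
        # already gone is not retried on every event.
        if docker_field not in self._names:
            self.lookups += 1
            self._names[docker_field] = self._resolver(docker_field)
        return self._names[docker_field]


# Stand-in resolver for the example; a real one would query the Docker API.
fake_docker = {"8691e2e6cf4b": "starwars-xwing-1"}
cache = NameCache(fake_docker.get)

for _ in range(1000):                      # 1000 events from the same container
    cache.name_for("8691e2e6cf4b")
print(cache.lookups)                       # 1
```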

Thank you again for your time and for the great work on Tetragon. I hope one day this feature will be available.

rnosal avatar Sep 01 '25 10:09 rnosal

@sayboras

"I think the container id is already available under docker field, please find below the details."

Even for the container ID, this field only contains the first 15 digits; see here.

inliquid avatar Sep 01 '25 15:09 inliquid

@rnosal maybe add a requirement to collect the image name and start time as well?

I would think of something similar to the container field, but moved out of the pod struct to the upper level.
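
As a rough illustration of that idea, the process object could carry a `container` entry alongside `docker` instead of inside `pod`. The sketch below is hypothetical: the field names (`id`, `name`, `image`, `start_time`) are assumptions based on this thread, not an existing Tetragon schema, and the full ID is a zero-padded placeholder built from the sample event earlier in the thread.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ContainerInfo:
    """Hypothetical top-level container metadata for a process."""
    id: str                           # full container ID
    name: str                         # e.g. the `docker ps` NAMES column
    image: Optional[str] = None       # image name, as suggested above
    start_time: Optional[str] = None  # container start time (RFC 3339)


def attach_container(process: dict, info: ContainerInfo) -> dict:
    """Return a copy of `process` with a top-level `container` field added.

    Unset optional fields are omitted from the output.
    """
    enriched = dict(process)
    enriched["container"] = {k: v for k, v in asdict(info).items() if v is not None}
    return enriched


process = {"pid": 775000, "binary": "/usr/bin/curl", "docker": "8691e2e6cf4bedcd801f276ea75515a"}
info = ContainerInfo(
    id="8691e2e6cf4bedcd801f276ea75515a" + "0" * 33,  # zero-padded placeholder ID
    name="starwars-xwing-1",
    image="quay.io/cilium/json-mock:v1.3.8",
)
print(json.dumps(attach_container(process, info), indent=2))
```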

inliquid avatar Sep 01 '25 15:09 inliquid

Yes, that sounds like exactly our case. It would provide immediate, actionable context for every event, directly linking it to the specific CI/CD job that triggered it without requiring any secondary lookups. This would be a huge improvement for plain Docker environments like ours. I updated the feature request using your idea.

rnosal avatar Sep 03 '25 08:09 rnosal