[TRACKING] Re-audit container engines for empty container info values (Initial focus on CRI for Kubernetes)
Describe the bug
No system or mechanism is perfect, but container engines should be re-audited for empty container info values (initial focus on CRI for Kubernetes).
The motivation is to get to the bottom of why container enrichment sometimes fails, and then to determine whether anything can still be improved.
In addition, I opened a proposal for a formal container engine testing framework: https://github.com/falcosecurity/libs/issues/1298.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
/remove-lifecycle stale
Based on improvements made in:
- https://github.com/falcosecurity/libs/pull/1433
- https://github.com/falcosecurity/libs/pull/1575
We will be able to better track cases where the container info is missing (leveraging new metrics and output fields). More testing will be performed in January 2024.
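As a rough illustration of the kind of tracking this enables, here is a minimal sketch that counts alerts with empty container fields. It assumes Falco is writing JSON alerts with an `output_fields` object, that host events carry `container.id` set to `host`, and that a missing value appears as `null`, an empty string, or `<NA>`. The helper names and the choice of audited fields are hypothetical and should be adapted to your own rules and output.

```python
import json

# Values treated as "missing". These are assumed conventions, not a spec.
EMPTY_VALUES = (None, "", "<NA>")

# Hypothetical selection of fields to audit; extend as needed.
AUDITED_FIELDS = ("container.name", "container.image.repository")


def is_container_event(fields):
    """True when the alert carries a container id other than the host marker."""
    container_id = fields.get("container.id")
    return container_id not in EMPTY_VALUES and container_id != "host"


def missing_fields(fields):
    """Return the audited fields that are empty in one alert."""
    return [name for name in AUDITED_FIELDS if fields.get(name) in EMPTY_VALUES]


def summarize(alert_lines):
    """Return (container_events, events_with_at_least_one_empty_field)."""
    total = 0
    incomplete = 0
    for line in alert_lines:
        fields = json.loads(line).get("output_fields") or {}
        if not is_container_event(fields):
            continue  # host events legitimately have no container info
        total += 1
        if missing_fields(fields):
            incomplete += 1
    return total, incomplete
```

Feeding `summarize` the lines of a JSON alert log gives a quick ratio of incomplete container events, which can be compared before and after an upgrade.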
/assign
We just merged https://github.com/falcosecurity/libs/pull/1595. Starting with Falco 0.38.0, container information will be stored in the container cache faster when running Falco with --disable-cri-async. This improvement should have a significant effect in production environments.
Syscall events are now expected to have significantly fewer missing container fields. However, if a syscall event triggers a rule too close to the container start, before the API call against the container runtime socket has finished (at least 500ms), the Falco alert may still contain missing container image fields.
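To make the timing window concrete, here is a toy model of the race described above. It is not how the libs container cache is implemented: the class, its methods, and the fixed 500 ms latency are all assumptions made for illustration, and time is passed in explicitly so the behavior is deterministic.

```python
# Assumed fixed latency of the runtime API call, taken from the figure above.
# Real lookups vary in duration.
LOOKUP_LATENCY_MS = 500


class ToyContainerCache:
    """Hypothetical cache whose entries become visible only after a delay."""

    def __init__(self):
        self._ready_at = {}  # container id -> time (ms) when the lookup finishes
        self._image = {}     # container id -> image name returned by the lookup

    def start_lookup(self, container_id, image, now_ms):
        """Record that a lookup began at now_ms and will finish later."""
        self._ready_at[container_id] = now_ms + LOOKUP_LATENCY_MS
        self._image[container_id] = image

    def image_at(self, container_id, now_ms):
        """Return the image if the lookup has finished by now_ms, else None."""
        ready = self._ready_at.get(container_id)
        if ready is None or now_ms < ready:
            return None  # an alert raised now would have an empty image field
        return self._image[container_id]


cache = ToyContainerCache()
cache.start_lookup("abc123", "nginx", now_ms=0)
early = cache.image_at("abc123", now_ms=100)  # rule fires right after container start
late = cache.image_at("abc123", now_ms=600)   # rule fires after the lookup finished
```

In this model `early` is `None` and `late` is `"nginx"`, which mirrors why an alert triggered very close to container start can still lack image fields.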
Longer term, we have identified more improvement opportunities; however, they will take more time. See https://github.com/falcosecurity/libs/issues/1708 for tracking (milestone TBD).
Another note: we have also improved our documentation (https://falco.org/docs/reference/rules/supported-fields/#field-class-container), which now states that under certain circumstances there may be a delay: "In instances of userspace container engine lookup delays, this field may not be available yet".