Undetected container/k8s information leads to alert fatigue caused by k8s_containers macro
Hello Falco team,
While evaluating Falco on AKS Kubernetes clusters, my team and I observed continuous alert generation triggered by the rule Contact K8S API Server From Container.
The large volume of generated alerts appears to be false positives, caused by incorrect evaluation of the macro k8s_containers, which in turn results from a missing value for the attribute container.image.repository.
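For context, the pieces involved look roughly like this in the default falco_rules.yaml (an abbreviated, approximate sketch; the exact conditions and the full image list vary by Falco version). With container.id empty and container.image.repository unset, the container macro still evaluates to true (an empty string is not equal to "host"), while k8s_containers evaluates to false, so the rule fires:
# Abbreviated, approximate sketch of the default ruleset; consult the
# falco_rules.yaml shipped with your Falco version for the exact text.
# The image list in k8s_containers is heavily shortened here.
- macro: container
  condition: (container.id != host)

- macro: k8s_containers
  condition: >
    (container.image.repository in (gcr.io/google_containers/hyperkube-amd64,
    sysdig/sysdig, fluent/fluentd-kubernetes-daemonset)
    or k8s.ns.name = kube-system)

- rule: Contact K8S API Server From Container
  condition: >
    evt.type=connect and evt.dir=< and (fd.typechar=4 or fd.typechar=6)
    and container and not k8s_containers and k8s_api_server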
Describe the bug
Once least-privileged Falco was deployed to an Azure AKS cluster, the rule Contact K8S API Server From Container started generating alerts every second. The generated alerts do not provide any container/k8s information, so no tracing could be performed.
The only valuable piece of information is the internal IP address of the k8s pod that initiated the connection, but that is not enough for tracing, and it gives the Falco engine nothing that would prevent the alert from triggering.
How to reproduce it
- Deploy least-privileged Falco in an AKS cluster
- Check the logs generated by the Falco pod/container
- Notice the alerts generated by the Falco rule Contact K8S API Server From Container; they are generated every second
- Notice that the alerts do not provide any container/k8s information; even the container.id attribute contains an empty string:
"container.id": ""
Expected behaviour
All container/k8s information, including container.id, should be properly detected; as a result, the macro k8s_containers should evaluate successfully and no false positive alerts should be generated.
Evidence
Alerts similar to the one listed below are generated in the Falco pod every second and, as a result, can lead to alert fatigue.
{
  "hostname": "aks-lhxynh188x-14626312-vmss000014",
  "output": "09:55:34.482587748: Notice Unexpected connection to K8s API Server from container (connection=10.244.11.4:39472->192.168.0.1:443 lport=443 rport=39472 fd_type=ipv4 fd_proto=fd.l4proto evt_type=connect user=<NA> user_uid=4294967295 user_loginuid=-1 process=<NA> proc_exepath= parent=<NA> command=<NA> terminal=0 container_id= container_image=<NA> container_image_tag=<NA> container_name=<NA> k8s_ns=<NA> k8s_pod_name=<NA>)",
  "priority": "Notice",
  "rule": "Contact K8S API Server From Container",
  "source": "syscall",
  "tags": [
    "T1565",
    "container",
    "k8s",
    "maturity_stable",
    "mitre_discovery",
    "network"
  ],
  "time": "2024-06-20T09:55:34.482587748Z",
  "output_fields": {
    "container.id": "",
    "container.image.repository": null,
    "container.image.tag": null,
    "container.name": null,
    "evt.time": 1718877334482587748,
    "evt.type": "connect",
    "fd.lport": 443,
    "fd.name": "10.244.11.4:39472->192.168.0.1:443",
    "fd.rport": 39472,
    "fd.type": "ipv4",
    "k8s.ns.name": null,
    "k8s.pod.name": null,
    "proc.cmdline": "<NA>",
    "proc.exepath": "",
    "proc.name": "<NA>",
    "proc.pname": null,
    "proc.tty": 0,
    "user.loginuid": -1,
    "user.name": "<NA>",
    "user.uid": 4294967295
  }
}
Environment
Falco deployed on AKS cluster using Falco Helm Chart version 4.4.2.
- Falco version: 0.38.0 (x86_64)
- System info:
{
  "machine": "x86_64",
  "nodename": "falco-8b7db",
  "release": "5.15.0-1064-azure",
  "sysname": "Linux",
  "version": "#73-Ubuntu SMP Tue Apr 30 14:24:24 UTC 2024"
}
- Cloud provider or hardware configuration: Azure
- OS:
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
- Kernel:
Linux falco-8b7db 5.15.0-1064-azure #73-Ubuntu SMP Tue Apr 30 14:24:24 UTC 2024 x86_64 GNU/Linux
- Installation method: deployed to the k8s cluster as a DaemonSet using Helm Chart version 4.4.2 with the following custom values YAML file:
driver:
  enabled: true
  kind: modern_ebpf
  modernEbpf:
    leastPrivileged: true
tty: true
falco:
  json_output: true
  json_include_output_property: true
image:
  pullPolicy: Always
  repository: falcosecurity/falco-distroless
extra:
  args:
    - --disable-cri-async
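For reference, with leastPrivileged: true the chart does not run the Falco container as privileged; instead it grants a reduced set of capabilities, roughly as sketched below (the exact capability set depends on the chart version):
# Rough sketch of the securityContext the chart renders when
# driver.modernEbpf.leastPrivileged is true; the exact set may differ by version
securityContext:
  capabilities:
    add:
      - BPF
      - SYS_RESOURCE
      - PERFMON
      - SYS_PTRACE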
Thanks @aberezovski. The fact that container.id is empty is strange, since we fetch the cgroup info in the kernel and it should always be there. Also a bunch of other data is missing.
For other rules and other events the container.id is not an empty string?
[The container rules macro only checks that container.id is not equal to "host"; we may also want to add a check for empty strings. But that does not solve the root problem.]
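In the meantime, a minimal sketch of such an empty-string check, written as a local override (hypothetical; it assumes the condition-override syntax available in recent Falco rules files), could look like:
# Hypothetical override in falco_rules.local.yaml: also treat an empty
# container.id as "not a container", so unresolved events stop matching
- macro: container
  condition: (container.id != host and container.id != "")
  override:
    condition: replace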
Hi @incertum,
Regarding your question above ("For other rules and other events the container.id is not an empty string?"): I observed that weird behaviour only for alerts generated by the rule Contact K8S API Server From Container, and ONLY when Falco is running as a non-privileged container.
For other rules and events I observed random behaviour where only the container.id value is shown and all other attributes' values regarding the container, pod, and k8s namespace are missing. See the other issue I opened for that: #3256
I observed that weird behavior only for alerts generated by rule Contact K8S API Server From Container and ONLY when falco is running as non-privileged container.
Thanks for opening the new issue, best to discuss it there. I tagged some other maintainers.
For other rules and events I observed a random behavior of showing only container.id value and missing all other attributes' values regarding container, pod and k8s namespace.
Known imperfection. There are other open issues. Given we do API calls there can be lookup time delays, plus we still need to improve the container engine in general. It is on my plate for the Falco 0.39.0 dev cycle.
I am also seeing the same message (with different IPs, ports and timestamps but with the same user_uid) when I use --set driver.modernEbpf.leastPrivileged=true. I also tried to set the capabilities manually to something like --set-json containerSecurityContext='{"capabilities":{"add":["CHOWN","DAC_OVERRIDE","DAC_READ_SEARCH","FOWNER","FSETID","KILL","SETGID","SETUID","SETPCAP","LINUX_IMMUTABLE","NET_BIND_SERVICE","NET_BROADCAST","NET_ADMIN","NET_RAW","IPC_LOCK","IPC_OWNER","SYS_MODULE","SYS_RAWIO","SYS_CHROOT","SYS_PTRACE","SYS_PACCT","SYS_ADMIN","SYS_BOOT","SYS_NICE","SYS_RESOURCE","SYS_TIME","SYS_TTY_CONFIG","MKNOD","LEASE","AUDIT_WRITE","AUDIT_CONTROL","SETFCAP","MAC_OVERRIDE","MAC_ADMIN","SYSLOG","WAKE_ALARM","BLOCK_SUSPEND","AUDIT_READ","PERFMON","BPF","CHECKPOINT_RESTORE"]}}' just to check if there is something missing there, but it did not fix the error. If I use --set driver.modernEbpf.leastPrivileged=false I do not see this error.
EDIT: I forgot to mention that my belief is that these log lines are generated for falco itself, i.e. when falco tries to get the metadata from the K8s API server to enrich the system calls.
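For reference, the values-file equivalent of the workaround above (reverting to the privileged deployment) would look approximately like:
# values.yaml equivalent of --set driver.modernEbpf.leastPrivileged=false;
# with this setting the alerts reportedly no longer appear
driver:
  enabled: true
  kind: modern_ebpf
  modernEbpf:
    leastPrivileged: false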
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Provide feedback via https://github.com/falcosecurity/community.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Provide feedback via https://github.com/falcosecurity/community.
/close
@poiana: Closing this issue.