k8s_audit plugin terminates due to long token
Describe the bug
I have Falco installed on my Kubernetes cluster via the Helm chart. Occasionally the pods fail due to a problem in the k8s_audit plugin.
Tue Feb 18 15:09:00 2025: Loaded event sources: syscall, k8s_audit
Tue Feb 18 15:09:00 2025: Enabled event sources: k8s_audit, syscall
Tue Feb 18 15:09:00 2025: Opening 'k8s_audit' source with plugin 'k8saudit'
Tue Feb 18 15:09:00 2025: Opening 'syscall' source with Kernel module
Tue Feb 18 15:09:16 2025: An error occurred in an event source, forcing termination...
Syscall event drop monitoring:
- event drop detected: 0 occurrences
- num times actions taken: 0
Error: bufio.Scanner: token too long
As I understand the code, Falco reads the audit events (in my case from a file) using bufio.Scanner from the Go standard library. Internally, the scanner uses a slice as a buffer and grows it dynamically when it fills up. However, unless a larger limit is set with Scanner.Buffer, the buffer cannot grow beyond MaxScanTokenSize, which is 64 * 1024 = 65,536 bytes. If a single audit event is longer than this, the scanner fails with ErrTooLong ("token too long") and the plugin terminates.
const (
	// MaxScanTokenSize is the maximum size used to buffer a token
	// unless the user provides an explicit buffer with [Scanner.Buffer].
	// The actual maximum token size may be smaller as the buffer
	// may need to include, for instance, a newline.
	MaxScanTokenSize = 64 * 1024
	startBufSize = 4096 // Size of initial allocation for buffer.
)
See: https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/bufio/scan.go;l=77
// Is the buffer full? If so, resize.
if s.end == len(s.buf) {
	// Guarantee no overflow in the multiplication below.
	const maxInt = int(^uint(0) >> 1)
	if len(s.buf) >= s.maxTokenSize || len(s.buf) > maxInt/2 {
		s.setErr(ErrTooLong)
		return false
	}
	...
}
See: https://cs.opensource.google/go/go/+/refs/tags/go1.24.0:src/bufio/scan.go;l=196
How to reproduce it
For example, our audit log file contains an event that is approx. 240,000 characters long, well above the 65,536-byte limit. In this case, Flux patches the CustomResourceDefinition HelmRelease via the kustomize-controller. The request and response contain a long OpenAPI specification.
The audit policy we use is based on the audit policy used by Amazon EKS.
- level: RequestResponse
  resources:
  - group: "apiextensions.k8s.io"
  ...
  omitStages:
  - "RequestReceived"
Question
Do you recommend avoiding long log messages in general, or is this a possible attack vector? Theoretically, I could stop Falco from sending alerts by creating a single long audit log entry.
Environment
- Falco Chart version: 4.20.0
- Kernel: Linux sanitized 6.1.83-4.ph5 #1-photon SMP PREEMPT_DYNAMIC Thu Apr 25 07:51:05 UTC 2024 x86_64 GNU/Linux
- Installation method: Kubernetes