feat: Add next-generation debug framework for enhanced BPF debugging
Introduce a next-generation debug framework that provides structured debug messages from BPF programs via perf buffers instead of the traditional bpf_printk() mechanism.
Why do this? We want to include helpful debugging messages throughout our eBPF codebase. Until now, however, we had no way to selectively emit debug output per subsystem, and our existing bpf_printk() usage lacked a standardized way of attaching much-needed context to each message. Moreover, reading the trace log with bpftool is tedious and error-prone in practice, particularly when juggling multiple terminals. The new debugging framework addresses these problems by providing a standardized way to encode debug messages in BPF code and submit them to userspace. Subsequent commits in this series will build on the framework to implement fine-grained subsystem filtering for BPF debug messages.
Key changes:
- Add PERF_DEBUG build flag and --enable-perf-debug CLI option
- Create new debug.h header with structured debug event framework
- Add debug reader in userspace to parse and display structured debug messages
- Implement the debug reader on a separate perf event map to avoid overfilling the events buffer and losing events
- Refactor set_in_init_tree() to accept context parameter for debug integration
The new framework provides:
- Structured debug messages with timestamp, PID, and CPU information
- Efficient message formatting using the bpf_snprintf() helper
- Conditional compilation based on TETRAGON_PERF_DEBUG flag
- Fallback to traditional bpf_printk() when perf debug is disabled
- Runtime control via --enable-perf-debug command line flag
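To make the "structured debug message" idea concrete, here is a minimal userspace sketch in plain C. It is not the code from this PR: the struct name `debug_event`, its field layout, `DEBUG_MSG_LEN`, and both function names are assumptions made for illustration. On the BPF side the formatting step would be done with bpf_snprintf() rather than vsnprintf(), and the record would be submitted through the debug perf event map.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEBUG_MSG_LEN 128 /* hypothetical fixed message size */

/* Hypothetical record layout: fixed header followed by the message text. */
struct debug_event {
	uint64_t ts_ns;          /* timestamp in nanoseconds */
	uint32_t pid;            /* process that emitted the message */
	uint32_t cpu;            /* CPU the program was running on */
	char msg[DEBUG_MSG_LEN]; /* NUL-terminated formatted message */
};

/*
 * Producer side: fill one record. This stands in for the BPF code,
 * where bpf_snprintf() would do the formatting.
 */
static void debug_event_fill(struct debug_event *ev, uint64_t ts_ns,
			     uint32_t pid, uint32_t cpu, const char *fmt, ...)
{
	va_list ap;

	assert(ev != NULL && fmt != NULL);
	memset(ev, 0, sizeof(*ev));
	ev->ts_ns = ts_ns;
	ev->pid = pid;
	ev->cpu = cpu;

	va_start(ap, fmt);
	vsnprintf(ev->msg, sizeof(ev->msg), fmt, ap); /* truncates if too long */
	va_end(ap);
}

/*
 * Reader side: decode a raw sample and render it as one display line.
 * Returns 0 on success, -1 if the sample is shorter than a full record.
 */
static int debug_event_render(const void *raw, size_t raw_len,
			      char *out, size_t out_len)
{
	struct debug_event ev;

	if (raw_len < sizeof(ev))
		return -1;

	memcpy(&ev, raw, sizeof(ev));
	ev.msg[DEBUG_MSG_LEN - 1] = '\0'; /* never trust the producer's terminator */

	snprintf(out, out_len, "[%llu] cpu=%u pid=%u %s",
		 (unsigned long long)ev.ts_ns, ev.cpu, ev.pid, ev.msg);
	return 0;
}
```

Calling `debug_event_fill()` and then passing the record to `debug_event_render()` yields a single line carrying the timestamp, CPU, PID, and message. A fixed-size record keeps the reader's length check trivial, at the cost of some wasted bytes for short messages.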
> Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risk losing events
Given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be a better choice here?
Also, stdout/stderr stream support was recently added to the kernel (https://lore.kernel.org/bpf/[email protected]/), which I was thinking of using eventually. It might be faster, but I guess it will take some time before it reaches customers' stable releases. Maybe just keep that in mind, so we could use the same framework with streams inside in the future.
> given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be better choice in here?
I should have a PR soon where we default to the BPF ring buffer instead of the perf ring buffer (and fall back when told to, or when it is not available). I'd imagine you could do the same by copying or refactoring.
In terms of benefits, the BPF ring buffer's reserve-and-commit approach saves us from adding more heap maps (we have a hard upper limit of 64 maps per program, and this will become important at some point). Also, one less copy has to be good!
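The difference between the two submission styles can be modelled in a few lines of plain C. This is a toy userspace model, not the kernel's ring buffer: `toy_ring`, `toy_reserve()`, `toy_commit()`, and `toy_output()` are hypothetical names, and the buffer never wraps or gets consumed. In real BPF code the corresponding helpers are bpf_ringbuf_reserve() with bpf_ringbuf_submit(), and bpf_perf_event_output() for the perf path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 256 /* toy capacity in bytes */

/* Toy model: a linear buffer with a reserve cursor and a publish cursor. */
struct toy_ring {
	uint8_t data[RING_SIZE];
	size_t head;      /* next free byte (reserved up to here) */
	size_t committed; /* bytes visible to the consumer */
};

/*
 * Reserve: hand out space inside the ring itself, so the caller can
 * build the record in place. Returns NULL when there is no room.
 */
static uint8_t *toy_reserve(struct toy_ring *r, size_t len)
{
	uint8_t *slot;

	assert(r != NULL);
	if (len > RING_SIZE - r->head)
		return NULL;

	slot = &r->data[r->head];
	r->head += len;
	return slot;
}

/* Commit: publish everything reserved so far to the consumer. */
static void toy_commit(struct toy_ring *r)
{
	assert(r != NULL);
	r->committed = r->head;
}

/*
 * Output-style submission: the record was already built in a separate
 * staging buffer (the role a per-CPU heap map plays in BPF), so it has
 * to be copied into the ring. This memcpy is the extra copy.
 */
static int toy_output(struct toy_ring *r, const void *staged, size_t len)
{
	uint8_t *slot = toy_reserve(r, len);

	if (slot == NULL)
		return -1;

	memcpy(slot, staged, len);
	toy_commit(r);
	return 0;
}
```

With reserve and commit the producer writes straight into the slot returned by `toy_reserve()` and needs no staging buffer, which is the property the comment above refers to when it mentions fewer heap maps and one less copy.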
Nice, I actually wanted to use a ring buffer originally but thought adding support for it would be out of scope. I'm happy to update the PR once we have initial ring buffer support.
> Nice I actually wanted to use a ring buffer originally but thought it would be out of scope to add support for it. I'm happy to update the PR when we have initial ringbuffer support.
Awesome. Shouldn't be long; just some tests to fix.