
feat: Add next-generation debug framework for enhanced BPF debugging

Open will-isovalent opened this issue 4 months ago • 5 comments

Introduce a next-generation debug framework that provides structured debug messages from BPF programs via perf buffers instead of the traditional bpf_printk() mechanism.

Why do this? It would be nice to include helpful debugging messages throughout our eBPF codebase. However, until now we had no way to selectively output debug messages by subsystem, and our previous bpf_printk() usage lacked a standardized way of providing much-needed context in its output. Moreover, reading from the trace log with bpftool can be tedious and error-prone in practice, particularly when juggling multiple terminals. The new debugging framework addresses these problems by providing a standardized way to encode and submit debug messages to userspace from BPF code. Subsequent commits in this series will leverage the new framework to implement fine-grained subsystem filtering for BPF debug messages.

Key changes:

  • Add PERF_DEBUG build flag and --enable-perf-debug CLI option
  • Create new debug.h header with structured debug event framework
  • Add debug reader in userspace to parse and display structured debug messages
  • Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risking lost events
  • Refactor set_in_init_tree() to accept context parameter for debug integration

The new framework provides:

  • Structured debug messages with timestamp, PID, and CPU information
  • Efficient message formatting using BPF snprintf() helper
  • Conditional compilation based on TETRAGON_PERF_DEBUG flag
  • Fallback to traditional bpf_printk() when perf debug is disabled
  • Runtime control via --enable-perf-debug command line flag
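
To make the list above concrete, here is a minimal userspace sketch in plain C. It is not the PR's debug.h: the `dbg_event` struct, the `DBG_MSG_LEN` size, the `dbg_fill` and `dbg_render` helpers, and the `DBG` macro are hypothetical names chosen for illustration. Standard `vsnprintf()` stands in for the BPF snprintf helper, `fprintf()` stands in for bpf_printk(), and the caller passes in the timestamp, PID, and CPU that BPF code would obtain from kernel helpers. Only the TETRAGON_PERF_DEBUG flag name comes from the description.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DBG_MSG_LEN 128 /* hypothetical fixed message size */

/* Hypothetical layout: a fixed header carrying the context fields named in
 * the description (timestamp, PID, CPU), followed by the formatted text. */
struct dbg_event {
	uint64_t ts_ns;
	uint32_t pid;
	uint32_t cpu;
	char msg[DBG_MSG_LEN];
};

/* Fill one event. Returns the number of message bytes stored, excluding the
 * terminating NUL; over-long messages are truncated to fit. */
static int dbg_fill(struct dbg_event *ev, uint64_t ts_ns, uint32_t pid,
		    uint32_t cpu, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(ev && fmt);
	memset(ev, 0, sizeof(*ev));
	ev->ts_ns = ts_ns;
	ev->pid = pid;
	ev->cpu = cpu;

	va_start(ap, fmt);
	n = vsnprintf(ev->msg, sizeof(ev->msg), fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	if (n >= (int)sizeof(ev->msg))
		n = (int)sizeof(ev->msg) - 1;
	return n;
}

/* Reader side: render one event as a single line of text. */
static int dbg_render(const struct dbg_event *ev, char *out, size_t len)
{
	assert(ev && out);
	return snprintf(out, len, "[%llu] pid=%u cpu=%u %s",
			(unsigned long long)ev->ts_ns, ev->pid, ev->cpu, ev->msg);
}

/* Compile-time switch: structured event when the flag is set, plain
 * printf-style output (standing in for bpf_printk) otherwise. */
#ifdef TETRAGON_PERF_DEBUG
#define DBG(ev, ts, pid, cpu, ...) dbg_fill(ev, ts, pid, cpu, __VA_ARGS__)
#else
#define DBG(ev, ts, pid, cpu, ...) fprintf(stderr, __VA_ARGS__)
#endif
```

For example, `dbg_fill(&ev, 1000, 42, 3, "ret=%d", -2)` followed by `dbg_render()` yields the line `[1000] pid=42 cpu=3 ret=-2`. Keeping the context in fixed header fields, rather than in the format string, is what lets a reader filter or sort messages without parsing text.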

will-isovalent avatar Aug 19 '25 20:08 will-isovalent

Debug reader is implemented via a separate perf event map to avoid overfilling the events buffer and risk losing events

Given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be a better choice here?

Also, stdout/stderr stream support was recently added to the kernel (https://lore.kernel.org/bpf/[email protected]/), which I was thinking of using eventually. It might be faster, but I guess it will take some time before it reaches customers' stable releases. Maybe just keep that in mind, so we could use the same framework with streams underneath in the future.

olsajiri avatar Aug 21 '25 10:08 olsajiri

given the last discussion with @kevsecurity on ring buffers, would the new ring buffer be better choice in here?

I should have a PR soon where we default to the BPF ring buffer instead of the perf ring buffer (and fall back when told to or when it is not available). I'd imagine you could do the same by copying or refactoring.

But in terms of benefits, the BPF ring buffer's reserve-and-commit approach avoids adding more heap maps (we have a hard upper limit of 64 maps per program, and this will become important at some point). Also, one less copy has to be good!
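
The reserve-and-commit point can be sketched with a toy model in plain C. This is not the kernel's BPF ring buffer and not Tetragon code: `toy_rb`, `toy_msg`, `toy_reserve`, `toy_commit`, and `toy_emit` are hypothetical names, loosely modeled on the kernel's `bpf_ringbuf_reserve()` and `bpf_ringbuf_submit()` helpers, and the model is single-threaded with fixed-size slots and no wraparound. It only illustrates that the producer builds the record directly in buffer memory, so no separate scratch buffer (such as a heap map) and no extra copy are needed before publishing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_RB_SLOTS 8 /* hypothetical capacity in records */

/* Hypothetical fixed-size record. */
struct toy_msg {
	uint32_t pid;
	char text[28];
};

/* Toy single-producer buffer: no wraparound, no concurrency handling. */
struct toy_rb {
	struct toy_msg slots[TOY_RB_SLOTS];
	size_t reserved;  /* slots handed out to the producer */
	size_t committed; /* slots visible to the consumer */
};

/* Reserve the next slot. Returns NULL when the buffer is full or when an
 * earlier reservation has not been committed yet. */
static struct toy_msg *toy_reserve(struct toy_rb *rb)
{
	assert(rb);
	if (rb->reserved != rb->committed)
		return NULL; /* one open record at a time in this toy model */
	if (rb->reserved == TOY_RB_SLOTS)
		return NULL; /* full */
	return &rb->slots[rb->reserved++];
}

/* Commit: make the open reservation visible to the consumer. */
static void toy_commit(struct toy_rb *rb)
{
	assert(rb);
	rb->committed = rb->reserved;
}

/* Producer: the record is written in place, inside the buffer itself. */
static int toy_emit(struct toy_rb *rb, uint32_t pid, const char *text)
{
	struct toy_msg *m = toy_reserve(rb);

	if (!m)
		return -1;
	m->pid = pid;
	snprintf(m->text, sizeof(m->text), "%s", text);
	toy_commit(rb);
	return 0;
}
```

With a perf-buffer-style output call, the record would first be assembled elsewhere and then copied into the buffer by the output helper; here the pointer returned by `toy_reserve()` is already the final location.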

kevsecurity avatar Aug 21 '25 10:08 kevsecurity

Nice, I actually wanted to use a ring buffer originally but thought it would be out of scope to add support for it. I'm happy to update the PR when we have initial ring buffer support.

will-isovalent avatar Aug 21 '25 15:08 will-isovalent

Nice I actually wanted to use a ring buffer originally but thought it would be out of scope to add support for it. I'm happy to update the PR when we have initial ringbuffer support.

Awesome. Shouldn't be long; just some tests to fix.

kevsecurity avatar Aug 21 '25 15:08 kevsecurity

Deploy Preview for tetragon ready!

Name Link
Latest commit db2bdbb071deb86ae703bf4a3bbb83a718d0a86e
Latest deploy log https://app.netlify.com/projects/tetragon/deploys/68a748a1b95d740008e22b0e
Deploy Preview https://deploy-preview-4024--tetragon.netlify.app

netlify[bot] avatar Aug 21 '25 15:08 netlify[bot]