bpftrace icon indicating copy to clipboard operation
bpftrace copied to clipboard

[RFC] Store printf arguments off-stack (in per-CPU scratch buffer)

Open Birch-san opened this issue 5 years ago • 0 comments

Incorporates @mmarchini's RFC https://github.com/iovisor/bpftrace/pull/750.
Incorporates parts of "big strings" https://github.com/iovisor/bpftrace/pull/1360.

I did not implement the probe_read() from https://github.com/iovisor/bpftrace/pull/750, because until str() too is moved off-stack, we won't be doing any memset/memcpy as large as 1024, and thus won't run into that limitation just yet. at any rate, https://github.com/iovisor/bpftrace/discussions/1800 may give us a way to do a "large memset/memcpy" (i.e. not need to resort to probe_read() at all).

My codegen changes are a bit bigger than in https://github.com/iovisor/bpftrace/pull/750, dominated by the handling to "exit if we fail to acquire map".

Both printf() and join() change in this review; it's possible to spin one of those out into a separate PR, but they'll both rely on this shared core.

One thing I need feedback on in this RFC is whether to simplify the codegen of programs like:

printf();
printf();

Or like tests/codegen/if_else_printf.ll:

if () {
  printf();
} else {
  printf();
}

To satisfy the BPF verifier, we check whether we succeeded in acquiring the map. but we end up doing this every time printf() is called, even though it's the same map.

I reckon it'd be good to hoist the map-validation to the start of the program (and do it exactly once), rather than emitting it at the call-site of printf(). should I do this?

Introduce CreateGetScratchMap helper

This is scratch space for bpftrace (off-stack memory in which we can do work such as prepare perf_event buffers), rather than a map that will be exposed directly to the user.
join() and format string calls (printf(), system(), cat()) now use this.

https://github.com/iovisor/bpftrace/pull/1360 demonstrates that str(), buf(), map keys and map values can use this in future.
https://github.com/iovisor/bpftrace/pull/1360 also used it to power map-backed variables (is it still 'scratch' at that point though?). a single per-CPU associative array keyed on variable name could be more appropriate for variable storage.

Let async_error be fatal

Rather than silently continuing down a no-op branch, we should exit() if we fail to acquire a map.

join() now exits on failure to acquire map.

format string off-stack

printf() and friends now compose their perf_event buffer off-stack.

more space on-stack

Originally, BPFTRACE_STRLEN=240 was the largest shellsnoop that we could fit on-stack.

But now we can go as high as BPFTRACE_STRLEN=504:

sudo BPFTRACE_STRLEN=504 ./src/bpftrace -e 'tracepoint:syscalls:sys_enter_write /pid == 76214/ {
  printf("<%s>\n", str(args->buf, args->count));
}'

Originally we were storing both str() and printf() on-stack.
Now we're just storing str() on-stack.

Checklist
  • [x] Language changes are updated in docs/reference_guide.md
    • No language changes
  • [ ] User-visible and non-trivial changes updated in CHANGELOG.md
    • Probably worth mentioning the bigger stack, and the "join() no longer fails open". Haven't updated yet.
  • [x] The new behaviour is covered by tests
    • printf() was already under test, and still works
    • join() was already under test, and still works

Birch-san avatar Apr 18 '21 22:04 Birch-san