bpftrace icon indicating copy to clipboard operation
bpftrace copied to clipboard

attached_probe: usdt: Use libbpf attachment API

Open Phoenix500526 opened this issue 6 months ago • 2 comments

Use libbpf API instead of bcc. libbpf is not only the maintained path forward, but it allows us to access APIs only available to link based kernel API.

Currently, this PR uses bpf_program__attach_uprobe_opts instead of bpf_program__attach_usdt because BCC employs bpf_attach_uprobe to attach USDT probes, which requires integration with other bpf_usdt_xxx APIs. Libbpf's bpf_program__attach_uprobe_opts is more similar to bpf_attach_uprobe, so I have opted to use bpf_program__attach_uprobe_opts as a replacement for bpf_attach_uprobe. I will gradually transition to using libbpf to replace BCC in AttachedUSDTProbe::make, and at that point, I will revisit whether bpf_program__attach_uprobe_usdt can be used instead of bpf_program__attach_uprobe_opts.

Closes: #4151

Checklist
  • [ ] Language changes are updated in man/adoc/bpftrace.adoc
  • [ ] User-visible and non-trivial changes updated in CHANGELOG.md
  • [ ] The new behaviour is covered by tests

Phoenix500526 avatar May 30 '25 16:05 Phoenix500526

@Phoenix500526 So the plan is to add an additional commit to this PR for the make method?

jordalgo avatar May 30 '25 17:05 jordalgo

@Phoenix500526 So the plan is to add an additional commit to this PR for the make method?

As it mention in #4151 , I prefer to make another pr to solve this problem. Should we open a new issue?

Phoenix500526 avatar May 31 '25 02:05 Phoenix500526

@Phoenix500526 - Sorry for the delay. @amscanne and I will take a look this week if you haven't figured it out already.

jordalgo avatar Jul 23 '25 10:07 jordalgo

@Phoenix500526 - Sorry for the delay. @amscanne and I will take a look this week if you haven't figured it out already.

@jordalgo Thanks! I've solved the most problems. There are three failed test cases so far and I'm taking 4 days off from tomorrow. Please drop me a message if anything comes up.

Phoenix500526 avatar Jul 23 '25 15:07 Phoenix500526

So I'm still investigating the test failures and should have a few suggestions for fixes up soon.

jordalgo avatar Jul 30 '25 15:07 jordalgo

So I'm still investigating the test failures and should have a few suggestions for fixes up soon.

@jordalgo Oh, thanks, that's really great. I also have some new findings. One of the reason causing the failure of index_*-type USDT probe is that the current libbpf doesn't support SIB(Scale Index Base) addressing mode. The libbpf doesn't handle the SIB argument, like "-1@-96(%rbp,%rax,8)", correctly. I've submit a patch to solve this issue.

Phoenix500526 avatar Jul 30 '25 15:07 Phoenix500526

@Phoenix500526 Funny, I was about to post the same finding but you're ahead of me and already posted a patch, which is awesome! IMHO, we don't have to wait for this patch to land before migrating as it feels like a pretty edge-case usage of USDTs (@amscanne or @viktormalik - feel to disagree).

jordalgo avatar Jul 30 '25 16:07 jordalgo

@Phoenix500526 So the Scale Index Base issue is accounting for 2 of the 3 test failures. The last one duplicated markers is failing because we're attaching two probes when we should just be attaching one. I believe that because this PR now uses uprobe-multi a single bpf program will get events from both tracepoints if they have the same name/marker.

If I run ./testprogs/usdt_inlined in the background and then run this:

$ ninja && sudo ./src/bpftrace -e 'usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe { printf("%d\n", arg1); }' -v
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Attached 2 probes
999
999
100
100
999
999
100
100

You see we have two probes both getting both USDT calls. However, if I hack a bit of code to only make sure to attach one then it works as expected:

$ ninja && sudo ./src/bpftrace -e 'usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe { printf("%d\n", arg1); }' -v
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Attached 1 probes
999
100
999
100

EDIT: So you might be able to get away with hard-coding num_locations to 1: https://github.com/bpftrace/bpftrace/blob/master/src/usdt.cpp#L39

But that also begs the question of if we can remove the bcc_usdt_new_frompid and bcc_usdt_new_frompath calls.

jordalgo avatar Jul 30 '25 17:07 jordalgo

@Phoenix500526 So the Scale Index Base issue is accounting for 2 of the 3 test failures. The last one duplicated markers is failing because we're attaching two probes when we should just be attaching one. I believe that because this PR now uses uprobe-multi a single bpf program will get events from both tracepoints if they have the same name/marker.

If I run ./testprogs/usdt_inlined in the background and then run this:

$ ninja && sudo ./src/bpftrace -e 'usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe { printf("%d\n", arg1); }' -v
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Attached 2 probes
999
999
100
100
999
999
100
100

You see we have two probes both getting both USDT calls. However, if I hack a bit of code to only make sure to attach one then it works as expected:

$ ninja && sudo ./src/bpftrace -e 'usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe { printf("%d\n", arg1); }' -v
Trying to attach probe: usdt:./tests/testprogs/usdt_inlined:tracetest:testprobe
Attached 1 probes
999
100
999
100

EDIT: So you might be able to get away with hard-coding num_locations to 1: https://github.com/bpftrace/bpftrace/blob/master/src/usdt.cpp#L39

But that also begs the question of if we can remove the bcc_usdt_new_frompid and bcc_usdt_new_frompath calls.

I'm trying to remove these two calls. I noticed a strange thing. IIRC, the libbpf is static linked currently. So the bpf_program__attach_usdt which comes from libbpf.a should be defined in bpftrace. But when I used nm to find this symbol, I got this:

$ nm src/bpftrace | grep bpf_program__attach_usdt
                 U bpf_program__attach_usdt

Actually, it should be T bpf_program__attach_usdt. And as it happens, the libbcc_bpf.so defines a same symbol.

$ nm /lib/x86_64-linux-gnu/libbcc_bpf.so.0 | grep bpf_program__attach_usdt
00000000000dfe10 T bpf_program__attach_usdt

It's obvious that the symbol bpf_program__attach_usdt in my bpftrace is come from libbcc_bpf.so. Can you help me to check this? Just nm your bpftrace to see whether the bpf_program__attach_usdt symbol is T or U. I guess it may be some environment issues on my PC. Thanks. @jordalgo

Phoenix500526 avatar Aug 02 '25 15:08 Phoenix500526

Crazy, it seems like codegen isn't handling the pointer cast correctly. I assume that we rely on the implicit casting during map assignment and other cases. I have all the runtime errors go ahead on my system with this patch:

diff --git a/src/ast/passes/codegen_llvm.cpp b/src/ast/passes/codegen_llvm.cpp
index d831f9d5..af971b2d 100644
--- a/src/ast/passes/codegen_llvm.cpp
+++ b/src/ast/passes/codegen_llvm.cpp
@@ -2868,6 +2868,8 @@ ScopedExpr CodegenLLVM::visit(Cast &cast)
         array = buf;
       }
       return ScopedExpr(b_.CreateLoad(int_ty, array, true));
+    } else if (cast.expr.type().IsPtrTy()) {
+      return ScopedExpr(b_.CreatePtrToInt(scoped_expr.value(), int_ty));
     } else {
       return ScopedExpr(
           b_.CreateIntCast(scoped_expr.value(),

But I wasn't able to test the multi-attach paths, and there may be some other things there. I will push that as a separate PR.

amscanne avatar Aug 07 '25 13:08 amscanne

Summarizing discussion from meeting. @Phoenix500526 will drop the last commit with the macro migration and keep the manual C types. This is apparently all working. We will fix a couple of issues and then separately refactor this code.

amscanne avatar Aug 07 '25 15:08 amscanne