
kretprobe on tcp_v4_connect has non-zero nmissed field

Open alban opened this issue 7 years ago • 7 comments

When a kretprobe is installed on a kernel function, there is a maximum limit on how many parallel calls it can catch (aka "maxactive"). In the case of an eBPF kretprobe, maxactive is left at the default defined in kernel/kprobes.c:

        /* Pre-allocate memory for max kretprobe instances */
        if (rp->maxactive <= 0) {
#ifdef CONFIG_PREEMPT
                rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());
#else
                rp->maxactive = num_possible_cpus();
#endif
        }

We can check if some kretprobes are ever missed:

# cat /sys/kernel/debug/tracing/kprobe_profile | grep -vw '0$'
  rtcp_v4_connect                                         1650              28

Every time I start tcptracer-bpf, it misses 28 kretprobes on tcp_v4_connect. I have not noticed any missing "connect" events but this is surprising. I don't know if it is a real bug or if kprobe_profile reports a false positive for some reason.

alban avatar Feb 23 '17 20:02 alban

Every time I start tcptracer-bpf, it misses 28 kretprobes on tcp_v4_connect.

How did you test this? I don't see it "every time I start tcptracer-bpf". I do see missed hits, but they seem to happen only once after kernel boot (as far as I can see, kprobe_profile doesn't get reset)?

diff --git a/semaphore.sh b/semaphore.sh
index db07459..09488bb 100755
--- a/semaphore.sh
+++ b/semaphore.sh
@@ -69,7 +69,8 @@ for kernel_version in "${kernel_versions[@]}"; do
         mount -t tmpfs tmpfs /tmp &&
         mount -t debugfs debugfs /sys/kernel/debug/ &&
         make &&
-        ./run'
+        ./run &&
+        cat /sys/kernel/debug/tracing/kprobe_profile | grep -vw "0$"'
 
     # Determine exit code from pod status due to
     # https://github.com/coreos/rkt/issues/2777

^ shows 24 misses (seemingly reliably).

schu avatar Feb 24 '17 08:02 schu

as far as I see kprobe_profile doesn't get reset

It does get reset. The high number of hits then stems from the offset-guessing phase.

schu avatar Feb 24 '17 09:02 schu

I printed how the nmissed increases during the initialization process:

https://gist.githubusercontent.com/alban/3509b952b69f0ed608ccd6a2f44caa65/raw/2348460451bf00ed1245548d4134e019af96a06e/gistfile1.txt

It seems to increase whenever bpf_probe_read() returns EFAULT (when guessing the network namespace offset).

I feared it could cause https://github.com/weaveworks/scope/issues/2379, but it seems unrelated, since that issue fails before the "network namespace" step.

alban avatar Mar 23 '17 11:03 alban

@schu added some bpf_trace_printk() in kprobe__tcp_v4_connect() and kretprobe__tcp_v4_connect() to see how they are called during the offset work:

scope-19720 [000] d...   497.511920: : kprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.512012: : kretprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.512459: : kprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.512555: : kretprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.513016: : kprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.513100: : kretprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.513538: : kprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19720 [000] d...   497.513703: : kretprobe__tcp_v4_connect - pid_tgid 84662395358472
 curl-19834 [000] d...   497.516090: : kprobe__tcp_v4_connect - pid_tgid 85186381368698
scope-19720 [000] d...   497.516345: : kprobe__tcp_v4_connect - pid_tgid 84662395358472
scope-19821 [000] d...   497.519377: : kprobe__tcp_v4_connect - pid_tgid 84662395358573
scope-19821 [000] d...   497.521792: : kprobe__tcp_v4_connect - pid_tgid 84662395358573
 curl-19834 [000] d...   497.529933: : kretprobe__tcp_v4_connect - pid_tgid 85186381368698
scope-19790 [000] d...   497.624384: : kprobe__tcp_v4_connect - pid_tgid 84658100391246
scope-19790 [000] dN..   497.624592: : kretprobe__tcp_v4_connect - pid_tgid 84658100391246

In this example, pid_tgid=84662395358472 is a thread of Scope that does the guessing work. There are other threads of Scope that perform connections at the same time and also other processes (e.g. curl above).

For each kprobe, there should be a kretprobe. For pid_tgid=84662395358472, this is the case until the ~178th connection.

At the same time, I got logs from offsetguess.go

rtcp_v4_connect   174               0
source port
rtcp_v4_connect   175               0
source port
rtcp_v4_connect   176               0
source port
rtcp_v4_connect   177               0
source port
rtcp_v4_connect   178               0
<probe> WARN: 2017/03/23 13:05:00.367957 Error setting up the eBPF tracker, falling back to proc scanning: failed to init perf map for IPv4 events: error guessing offsets: invalid guessing state while guessing source port, got checking expected checked, status.offset_saddr: 4

tcptracer-bpf's guessing code does 178 connections before it fails.

I notice that there are 4 successive kprobes with only 1 kretprobe. This is the problem.

We reproduced it only on the GCE instance with 4.4.0-66-generic and only one "possible" cpu (cpu#0):

# cat /sys/devices/system/cpu/possible
0
# cat /boot/config-4.4.0-66-generic | grep CONFIG_PREEMPT
# CONFIG_PREEMPT is not set

In this case, we should only have maxactive == 1:

        /* Pre-allocate memory for max kretprobe instances */
        if (rp->maxactive <= 0) {
#ifdef CONFIG_PREEMPT
                rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());
#else
                rp->maxactive = num_possible_cpus();
#endif
        }

At first glance, tcp_v4_connect() does not sleep (no schedule() or semaphore in this function), so I don't understand how we can have several calls in flight on a non-preemptive kernel before the function returns.

I don't know if it is possible to configure a kretprobe with a higher maxactive. @iaguis is looking into that.

alban avatar Mar 23 '17 15:03 alban

This explains why I cannot reproduce the issue on my laptop: I have 8 "possible" cpus there (cat /sys/devices/system/cpu/possible), so it's much harder to have that many concurrent tcp_v4_connect calls at the same time.

alban avatar Mar 23 '17 15:03 alban

I don't know if it is possible to configure a kretprobe with a higher maxactive. @iaguis is looking into that.

From what I see in the kernel, there's no way to configure maxactive unless you set it explicitly in struct kretprobe.

This is the output of searching for maxactive in the kernel tree:

iaguis@locke-xps13: ~/git/linux [(v4.10)] 
> rg maxactive
Documentation/kprobes.txt
52:a post_handler, and how to use the maxactive and nmissed fields of
135:register_kretprobe(), the user sets the maxactive field of the
141:spinlock held, maxactive = 1 should be enough.  If the function is
143:or preemption), NR_CPUS should be enough.  If maxactive <= 0, it is
147:It's not a disaster if you set maxactive too low; you'll just miss
433:You must set rp->maxactive appropriately before you call

kernel/kprobes.c
1861:	if (rp->maxactive <= 0) {
1863:		rp->maxactive = max_t(unsigned int, 10, 2*num_possible_cpus());
1865:		rp->maxactive = num_possible_cpus();
1870:	for (i = 0; i < rp->maxactive; i++) {

include/linux/kprobes.h
177: * User needs to provide a handler function, and initialize maxactive.
178: * maxactive - The maximum number of instances of the probed function that
181: * ignored, due to maxactive being too low.
188:	int maxactive;

samples/kprobes/kretprobe_example.c
75:	.maxactive		= 20,
98:	/* nmissed > 0 suggests that maxactive was set too low. */

I only see the place where it's set (see previous comment), an example kernel module where it's set directly in struct kretprobe, documentation, and the definition of struct kretprobe.
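For reference, this is how a kernel module sets it, following samples/kprobes/kretprobe_example.c (a sketch of kernel code, not buildable in userspace; eBPF programs have no equivalent knob at this point):

```c
/* Kernel-module sketch after samples/kprobes/kretprobe_example.c:
 * maxactive is set explicitly before register_kretprobe(). */
#include <linux/kprobes.h>
#include <linux/module.h>

static int ret_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
    return 0;
}

static struct kretprobe my_kretprobe = {
    .handler        = ret_handler,
    .kp.symbol_name = "tcp_v4_connect",
    .maxactive      = 20, /* pre-allocate 20 instances instead of the default */
};

static int __init kretprobe_init(void)
{
    return register_kretprobe(&my_kretprobe);
}

static void __exit kretprobe_exit(void)
{
    unregister_kretprobe(&my_kretprobe);
}

module_init(kretprobe_init);
module_exit(kretprobe_exit);
MODULE_LICENSE("GPL");
```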

iaguis avatar Mar 23 '17 16:03 iaguis

We just asked about this in iovisor-dev: https://lists.iovisor.org/pipermail/iovisor-dev/2017-March/000694.html

iaguis avatar Mar 23 '17 16:03 iaguis