bcc
bcc copied to clipboard
KProbes Missing Issue
Our team develop one tool based on bcc to detect dns issue. we register multi KProbes event in network stack to analysis network packets, such as:
kprobe__ip_rcv
kprobe__tcp_rcv_established
kprobe__tcp_rcv_state_process
kprobe__ipt_do_table
kprobe__ip_vs_out
…
We compare the results captured by bcc tools with the packets captured by tcpdump and found that KProbes will lose some events and cause defects in our developed programs.
The kernel code shows that if the current cpu is occupied by other KProbes events, there will be kporbes missing and increase the count of p->nmissed.
if (kprobe_running()) {
kprobes_inc_nmissed_count(p);
}
I dumped the kernel stack when the missing occurred, One situation is that the current cpu is sending a message and happens to enter the KProbes handler. At this time, the NIC receives the packet and enters the interrupt context., Then the KProbes registered on the receive path will be discarded.
The stack as follows:
Aug 11 06:49:56 kernel: lvlin kprobes_inc_nmissed_count p symbol_name: tcp_data_queue, p nmissed: 15
Aug 11 06:49:56 kernel: CPU: 15 PID: 4272 Comm: iperf Kdump: loaded Tainted: P OE 5.4.0-96-generic #109
Aug 11 06:49:56 kernel: Hardware name: Supermicro SYS-6029TP-H-EI012/X11DPT-PS-EI012, BIOS 3.1.2.0 05/31/2019
Aug 11 06:49:56 kernel: Call Trace:
Aug 11 06:49:56 kernel: <IRQ>
Aug 11 06:49:56 kernel: dump_stack+0x6d/0x8b
Aug 11 06:49:56 kernel: ? tcp_data_queue_ofo+0x6d0/0x6d0
Aug 11 06:49:56 kernel: kprobes_inc_nmissed_count.cold+0x40/0x4f
Aug 11 06:49:56 kernel: kprobe_ftrace_handler+0x55/0xf0
Aug 11 06:49:56 kernel: ? tcp_rcv_established+0x1e9/0x670
Aug 11 06:49:56 kernel: ftrace_ops_assist_func+0x8d/0x120
Aug 11 06:49:56 kernel: 0xffffffffc08ad0da
Aug 11 06:49:56 kernel: RIP: 0010:tcp_data_queue+0x5/0x690
Aug 11 06:49:56 kernel: Code: ff 4c 89 e7 e8 0c 88 00 00 41 0f b6 b4 24 ae 06 00 00 e9 63 fe ff ff e8 e9 50 73 ff 66 0f 1f 84 00 00 00 00 00 e8 3b 89 b4 16 <55> 48 89 e5 41 57 41 56 49 89 f6 41 55 41 54 53 48 83 ec 28 65 48
Aug 11 06:49:56 kernel: RSP: 0018:ffffbbbf0cec8c40 EFLAGS: 00000286 ORIG_RAX: 0000009000148f53
Aug 11 06:49:56 kernel: RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
Aug 11 06:49:56 kernel: RDX: ffff903c34fee0e2 RSI: ffff903c6fb79200 RDI: ffff906b5b176040
Aug 11 06:49:56 kernel: RBP: ffffbbbf0cec8c88 R08: 0000000000000004 R09: ffffbbbf0cec8bf0
Aug 11 06:49:56 kernel: R10: 0000000000000000 R11: 000000001f62c036 R12: ffff906b5b176040
Aug 11 06:49:56 kernel: R13: ffff903c6fb79200 R14: ffff906bad407e00 R15: ffff903c34fee0e2
Aug 11 06:49:56 kernel: ? tcp_data_queue+0x5/0x690
Aug 11 06:49:56 kernel: ? tcp_rcv_established+0x1e9/0x670
Aug 11 06:49:56 kernel: ? tcp_data_queue+0x5/0x690
Aug 11 06:49:56 kernel: ? tcp_rcv_established+0x1e9/0x670
Aug 11 06:49:56 kernel: tcp_v4_do_rcv+0x140/0x200
Aug 11 06:49:56 kernel: tcp_v4_rcv+0xc08/0xd10
Aug 11 06:49:56 kernel: ? tcp_v4_rcv+0x5/0xd10
Aug 11 06:49:56 kernel: ip_protocol_deliver_rcu+0x30/0x1b0
Aug 11 06:49:56 kernel: ip_local_deliver_finish+0x48/0x50
Aug 11 06:49:56 kernel: ip_local_deliver+0xe5/0xf0
Aug 11 06:49:56 kernel: ? ip_protocol_deliver_rcu+0x1b0/0x1b0
Aug 11 06:49:56 kernel: ip_rcv_finish+0x85/0xa0
Aug 11 06:49:56 kernel: ip_rcv+0xbc/0xd0
Aug 11 06:49:56 kernel: ? ip_rcv_finish_core.isra.0+0x3c0/0x3c0
Aug 11 06:49:56 kernel: __netif_receive_skb_one_core+0x88/0xa0
Aug 11 06:49:56 kernel: __netif_receive_skb+0x18/0x60
Aug 11 06:49:56 kernel: process_backlog+0xa9/0x160
Aug 11 06:49:56 kernel: net_rx_action+0x142/0x390
Aug 11 06:49:56 kernel: __do_softirq+0xe1/0x2d6
Aug 11 06:49:56 kernel: irq_exit+0xae/0xb0
Aug 11 06:49:56 kernel: do_IRQ+0x5a/0xf0
Aug 11 06:49:56 kernel: common_interrupt+0xf/0xf
Aug 11 06:49:56 kernel: </IRQ>
Aug 11 06:49:56 kernel: RIP: 0010:read_tsc+0x3/0x10
Aug 11 06:49:56 kernel: Code: 5d c3 90 90 90 90 90 8b 05 66 ba c0 01 c3 66 0f 1f 84 00 00 00 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 01 f9 <66> 90 48 c1 e2 20 48 09 d0 c3 0f 1f 00 8b 0d aa 63 ef 01 48 89 f8
Aug 11 06:49:56 kernel: RSP: 0018:ffffbbbf0f6ef708 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
Aug 11 06:49:56 kernel: RAX: 0000000082259c46 RBX: 00000079e646a611 RCX: 000000000000000f
Aug 11 06:49:56 kernel: RDX: 000000000047b3f2 RSI: ffffbbbf0cf66038 RDI: ffffffffaae2fda0
Aug 11 06:49:56 kernel: RBP: ffffbbbf0f6ef738 R08: 0000000000001000 R09: 0000000000000000
Aug 11 06:49:56 kernel: R10: 0000000000000002 R11: ffffbbbf0f6ef7ac R12: 0000000000028178
Aug 11 06:49:56 kernel: R13: ffffffffaaeb7688 R14: 0047b3f2813079f2 R15: ffffffffffffffff
Aug 11 06:49:56 kernel: ? ktime_get_mono_fast_ns+0x4e/0xa0
Aug 11 06:49:56 kernel: bpf_ktime_get_ns+0xe/0x10
Aug 11 06:49:56 kernel: ? trace_call_bpf+0x77/0xf0
Aug 11 06:49:56 kernel: ? ipt_do_table+0x1/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? ipt_do_table+0x5/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? kprobe_perf_func+0x244/0x280
Aug 11 06:49:56 kernel: ? trace_call_bpf+0x77/0xf0
Aug 11 06:49:56 kernel: ? ipt_do_table+0x5/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? kprobe_dispatcher+0x5d/0x70
Aug 11 06:49:56 kernel: ? aggr_pre_handler+0x45/0x80
Aug 11 06:49:56 kernel: ? ipt_do_table+0x1/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? kprobe_ftrace_handler+0x96/0xf0
Aug 11 06:49:56 kernel: ? iptable_filter_hook+0x1f/0x30 [iptable_filter]
Aug 11 06:49:56 kernel: ? ftrace_ops_assist_func+0x8d/0x120
Aug 11 06:49:56 kernel: ? kretprobe_hash_unlock+0x2e/0x30
Aug 11 06:49:56 kernel: ? 0xffffffffc08ad0da
Aug 11 06:49:56 kernel: ? ipt_do_table+0x1/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? ipt_do_table+0x5/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? iptable_filter_hook+0x1f/0x30 [iptable_filter]
Aug 11 06:49:56 kernel: ? ipt_do_table+0x5/0x620 [ip_tables]
Aug 11 06:49:56 kernel: ? elfcorehdr_read+0x40/0x40
Aug 11 06:49:56 kernel: ? nf_hook_slow+0x45/0xb0
Aug 11 06:49:56 kernel: ? __ip_local_out+0xd7/0x140
Aug 11 06:49:56 kernel: ? ip_forward_options+0x1a0/0x1a0
Aug 11 06:49:56 kernel: ? ip_local_out+0x1d/0x50
Aug 11 06:49:56 kernel: ? __ip_queue_xmit+0x16d/0x410
Aug 11 06:49:56 kernel: ? ip_queue_xmit+0x10/0x20
Aug 11 06:49:56 kernel: ? __tcp_transmit_skb+0x535/0xaf0
Aug 11 06:49:56 kernel: ? tcp_write_xmit+0x3c6/0xc00
Aug 11 06:49:56 kernel: ? tcp_push_one+0x3c/0x40
Aug 11 06:49:56 kernel: ? tcp_sendmsg_locked+0x6ee/0xde0
Aug 11 06:49:56 kernel: ? tcp_sendmsg+0x2d/0x50
Aug 11 06:49:56 kernel: ? inet_sendmsg+0x43/0x70
Aug 11 06:49:56 kernel: ? sock_sendmsg+0x5e/0x70
Aug 11 06:49:56 kernel: ? sock_write_iter+0x93/0xf0
Aug 11 06:49:56 kernel: ? new_sync_write+0x125/0x1c0
Aug 11 06:49:56 kernel: ? __vfs_write+0x29/0x40
Aug 11 06:49:56 kernel: ? vfs_write+0xb9/0x1a0
Aug 11 06:49:56 kernel: ? ksys_write+0x67/0xe0
Aug 11 06:49:56 kernel: ? __x64_sys_write+0x1a/0x20
Aug 11 06:49:56 kernel: ? do_syscall_64+0x57/0x190
Aug 11 06:49:56 kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
my question is Are there other case that would cause KProbes Missing? Is it possible to improve something to avoid the occurrence of KProbes Missing?