Socket tracer and TCP stats connector attach conflicting BPF probes
The socket tracer and TCP stats connector can't be enabled together. Since Vizier v0.14.12 (https://github.com/pixie-io/pixie/pull/1989), the socket tracer attaches a BPF probe to tcp_sendmsg, which conflicts with the TCP stats connector's probe on the same function.
Reproducing the issue
Running the PEM or stirling_wrapper with the CLI flag --stirling_sources=socket_tracer,tcp_stats produces the following error output:
I20241125 20:56:57.037268 3740444 source_connector.cc:35] Initializing source connector: tcp_stats
I20241125 20:56:57.037304 3740444 kernel_version.cc:82] Obtained Linux version string from `uname`: 5.15.0-1067-gke
I20241125 20:56:57.037322 3740444 linux_headers.cc:395] Detected kernel release (uname -r): 5.15.0-1067-gke
I20241125 20:56:57.037364 3740444 linux_headers.cc:206] Using Linux headers from: /lib/modules/5.15.0-1067-gke/build.
I20241125 20:56:57.037444 3740444 bcc_wrapper.cc:166] Initializing BPF program ...
I20241125 20:57:00.180481 3740444 scoped_timer.h:48] Timer(init_bpf_program) : 3.14 s
cannot create /var/tmp/bcc
WARNING: cannot get prog tag, ignore saving source with program tag
cannot attach kprobe, File exists
W20241125 20:57:00.208313 3740444 stirling.cc:416] Source Connector (registry name=tcp_stats) not instantiated, error: Internal : Unable to attach kprobe for tcp_sendmsg using probe_entry_tcp_sendmsg
Background
While BCC does support multiple kprobes on the same kernel function from the same process, this is only available via the perf_event kprobe API. BCC optimistically tries this API first and falls back to the text-based API (/sys/kernel/tracing/kprobe_events). In Pixie's case, the text-based method is always used because we specify maxactive, which the perf_event API doesn't support (source).
Note: there was an effort to add maxactive support to the perf_event API (kernel patch). The change never made it upstream because of concerns that the newer rethook implementation would render it obsolete.
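To make the failure mode concrete, here is a toy Python model of the text-based path. It is not BCC or kernel code: the KprobeEventsModel class, the event_name helper, the PID, and the socket tracer handler name are all hypothetical, and the sketch assumes the event name is derived only from the probe type, the kernel function, and the PID. Under that assumption, a second entry probe on tcp_sendmsg from the same process maps to a name that is already registered, which is rejected the same way the log above shows ("File exists").

```python
import errno


class KprobeEventsModel:
    """Toy stand-in for the text-based kprobe_events file (not the real kernel API).

    The only property modeled here is that group/event names must be unique.
    """

    def __init__(self):
        self.events = {}

    def add(self, group, event, symbol):
        key = (group, event)
        if key in self.events:
            # Mirrors the EEXIST ("File exists") error seen in the log output.
            raise FileExistsError(errno.EEXIST, "File exists", f"{group}/{event}")
        self.events[key] = symbol


def event_name(kind, symbol, pid):
    # Assumption: the name depends only on probe kind, symbol, and PID,
    # so it carries nothing that distinguishes one source connector from another.
    return f"{kind}_{symbol}_bcc_{pid}"


def attach_entry_probe(registry, symbol, pid, handler):
    registry.add("kprobes", event_name("p", symbol, pid), symbol)
    return handler


registry = KprobeEventsModel()
pid = 1234  # hypothetical PEM process ID

# First attach succeeds (hypothetical handler name for the socket tracer).
attach_entry_probe(registry, "tcp_sendmsg", pid, "socket_tracer_tcp_sendmsg")

# Second attach from the same process targets the same event name.
try:
    attach_entry_probe(registry, "tcp_sendmsg", pid, "probe_entry_tcp_sendmsg")
    conflict = None
except FileExistsError as err:
    conflict = err.strerror

print(conflict)  # prints "File exists"
```

Whichever source connector initializes second is the one that fails, which matches tcp_stats being the connector reported as not instantiated.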
Solutions to consider
- Add a configuration toggle to disable the socket tracer from probing tcp_sendmsg
  - This would be a stopgap for users who want to run both source connectors while a longer-term solution is implemented
- Use BCC's kfuncs [1] to allow both probes to exist
  - This requires validating that kfuncs work with multiple probes, but my initial research suggests this is likely. kfuncs require Linux 5.5+, so the tradeoffs of adopting them would need to be thought through (e.g. tcp stats could be migrated while the socket tracer stays on kprobes, since the former is not enabled in Pixie by default).
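One way the kfunc validation could be approached is sketched below using BCC's Python bindings. This is an untested sketch, not Pixie code: it assumes root privileges, a BCC build and kernel with kfunc support, and that loading the same program twice is a fair proxy for two source connectors. The function names load_two_probes and read_hits and the map name hits are hypothetical. If both counters increase while the host sends TCP traffic, two kfunc probes on tcp_sendmsg can coexist in one process.

```python
import ctypes

# BPF C program using BCC's KFUNC_PROBE macro; it counts calls to tcp_sendmsg.
PROGRAM = r"""
BPF_ARRAY(hits, u64, 1);

KFUNC_PROBE(tcp_sendmsg)
{
    int zero = 0;
    u64 *count = hits.lookup(&zero);
    if (count) {
        lock_xadd(count, 1);
    }
    return 0;
}
"""


def load_two_probes():
    """Load the program twice so two kfunc probes target tcp_sendmsg."""
    # Imported lazily so this file can be read and inspected without BCC installed.
    from bcc import BPF

    if not BPF.support_kfunc():
        raise RuntimeError("this kernel or BCC build does not support kfunc probes")

    # Each BPF object compiles and loads its own copy of the program.
    first = BPF(text=PROGRAM)
    second = BPF(text=PROGRAM)
    return first, second


def read_hits(bpf):
    """Return the number of tcp_sendmsg calls seen by one loaded program."""
    return bpf["hits"][ctypes.c_int(0)].value
```

To try it, call load_two_probes() as root, generate some TCP traffic, and compare read_hits() for both returned objects. A failure on the second load would suggest the kfunc route has the same limitation as the text-based kprobe path.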
[1] Example application using BCC kfunc probe