Populate client side trace's local address via tcp kprobes
Summary: Populate client side trace's local address via tcp kprobes
This change populates client side trace's local_addr and local_port columns for the following use cases:
- To provide more consistency for the protocol data tables. Empty columns make it difficult for end users to understand what is being traced and make the tables less useful
- To facilitate addressing a portion of the short lived process problems (#1638)
For the second use case, the root of the issue is that the df.ctx["pod"] syntax relies on the px.upid_to_pod_name function. If a PEM misses the short lived process during its metadata update, this function fails to resolve the pod name. For client side traces where the pod is making an outbound (non localhost) connection, the local_addr column provides an alternative pod name lookup for short lived processes when the pod itself is long lived. This means the following would be equivalent to the df.ctx["pod"] lookup: px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr)).
I intend to follow this PR with a compiler change that will make df.ctx["pod"] try both methods should px.upid_to_pod_name fail to resolve. This will allow the existing pxl scripts to display the previously missed short lived processes.
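The two-step lookup described above can be modeled in plain Python. This is only a sketch of the fallback logic, not PxL and not the planned compiler change: the dictionaries, the sample UPIDs, addresses and pod names, and the empty-string-on-miss convention are all assumptions made for illustration.

```python
# Hypothetical metadata snapshots standing in for the PEM's metadata state.
UPID_TO_POD = {"upid-long-lived": "default/web-0"}
IP_TO_POD_ID = {"10.0.0.7": "pod-id-123"}
POD_ID_TO_NAME = {"pod-id-123": "default/curl-client"}


def upid_to_pod_name(upid: str) -> str:
    """Stand-in for px.upid_to_pod_name; returns "" when the UPID was missed."""
    return UPID_TO_POD.get(upid, "")


def ip_to_pod_name(local_addr: str) -> str:
    """Stand-in for px.pod_id_to_pod_name(px.ip_to_pod_id(local_addr))."""
    pod_id = IP_TO_POD_ID.get(local_addr, "")
    return POD_ID_TO_NAME.get(pod_id, "")


def resolve_pod(upid: str, local_addr: str) -> str:
    """Try the UPID lookup first, then fall back to the local address."""
    return upid_to_pod_name(upid) or ip_to_pod_name(local_addr)
```

In this model, a trace from a short lived process whose UPID never reached the metadata state still resolves to a pod name as long as the pod's IP is known.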
Alternatives
Another approach I considered was expanding our use of the sock_alloc kprobe. I used ftrace on a simple curl command to see what other options could be used (sudo trace-cmd record -F -p function_graph http://google.com). The socket syscall calls sock_alloc, which would be another mechanism for accessing the struct sock. I decided against this approach because I don't think it's viable to assume that the same thread/process that calls socket will be the one that makes the later syscalls, which is what our BPF map setup would require. It's common to have a forking web server model, which means the process/thread that calls socket can differ from the ones that later read from or write to it.
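The concern about the creating thread differing from the thread that later uses the socket can be illustrated with a small Python model. This is a sketch under stated assumptions: `sock_by_creator` is a hypothetical stand-in for a BPF map keyed by the caller of socket, and a worker thread stands in for a forked worker process.

```python
import socket
import threading

# Hypothetical stand-in for a BPF map keyed by the thread that created the socket.
sock_by_creator = {}


def create_socket():
    """Create a connected socket pair and record it under the creating thread's id."""
    left, right = socket.socketpair()
    sock_by_creator[threading.get_native_id()] = left
    return left, right


def lookup_from_worker(sock):
    """Write to the socket from another thread and look it up by that thread's id."""
    result = {}

    def worker():
        sock.sendall(b"ping")
        result["tid"] = threading.get_native_id()
        # The worker did not create the socket, so a creator-keyed lookup misses.
        result["found"] = sock_by_creator.get(result["tid"])

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result
```

Calling `create_socket()` and then `lookup_from_worker(left)` delivers the data to the peer, but the lookup keyed on the writing thread finds no entry, which is the mismatch that made the sock_alloc approach unattractive.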
Probe stability
These probes appear to be stable across our oldest and newest supported kernels. The probed functions exist in the tcp_prot and tcpv6_prot structs, and I've seen that other projects and bcc tools use these probes. This makes me believe that these functions have a pretty well defined interface.
Relevant Issues: #1829, #1638
Type of change: /kind feature
Test Plan: New tests verify that ipv4 and ipv6 cases work
- [x] Ran `for i in $(seq 0 1000); do curl http://google.com/$i; sleep 2; done` within a pod and verified that `local_addr` is populated with this change and `px.pod_id_to_pod_name(px.ip_to_pod_id(df.local_addr))` works for pod name resolution.
- [x] Verified the above curl test results in traces without `local_addr` without this change
Changelog Message: Populate socket tracer data table local_addr and local_port column for client side traces.
@oazizi000 would appreciate if you could give this a review. I still need to fix the bpf instruction issue for the 4.x kernels and the incorrect storage of the local port (it is stored in network byte order and is an existing issue), but that shouldn't impact the core details in the PR.
Approach looks good. We should just manually test on a few more endpoints to gain more confidence. If that all looks good, then this should be good to go.
@oazizi000 I've tested this change on EKS (Amazon Linux 2), GKE Ubuntu and GKE COS in addition to tracking down the endianness problem (kernel data structure stores the local port in host byte order).
I've also created #2002 to make that byte order more clear even though that case doesn't have a bug.
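For readers following the byte order discussion, here is a small Python sketch of why the distinction matters. It is not taken from the PR; port 8080 is an arbitrary example, and the helper names are hypothetical. A port held in network (big-endian) byte order reads as a different number if it is interpreted as little-endian without conversion, whereas a value already in host byte order needs no conversion at all.

```python
import struct


def port_from_network_order(raw: bytes) -> int:
    """Correctly decode a 16-bit port stored in network (big-endian) order."""
    return struct.unpack("!H", raw)[0]


def port_misread_as_little_endian(raw: bytes) -> int:
    """Decode the same bytes as little-endian, as a host would without ntohs."""
    return struct.unpack("<H", raw)[0]


# 8080 is 0x1F90; in network order the bytes are 0x1F, 0x90.
wire = struct.pack("!H", 8080)
```

Here `port_from_network_order(wire)` yields 8080, while `port_misread_as_little_endian(wire)` yields 36895 (0x901F), the byte-swapped value.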