netobserv-ebpf-agent
TCP flag-based sampling ("Smart sampling")
When sampling is enabled, we might miss important events such as the establishment and termination of TCP connections. It may be useful to add a setting that makes the agent always send flows that contain specific TCP flags (e.g. SYN, FIN). This would ensure that no connection is missed.
Cons:
- This makes it trickier to normalize the bytes/packets counters (multiplying counters by sampling rate)
- If the cluster is flooded with short connections, then most flows will contain the SYN or FIN flags. This makes the sampling ineffective.
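To illustrate the first con, here is a minimal sketch of why a single multiplier stops working once some traffic bypasses sampling. The function names, the bucket layout and the byte counts are all made up for illustration; only the 1-in-50 rate comes from this discussion.

```python
def naive_estimate(buckets, default_rate):
    """Scale everything by the default rate, as if all traffic were sampled alike."""
    return sum(b["bytes"] for b in buckets) * default_rate


def per_rate_estimate(buckets):
    """Scale each bucket by the rate that was actually applied to it.

    A rate of 0 or 1 means nothing was dropped, so no scaling is needed.
    """
    return sum(b["bytes"] * max(b["rate"], 1) for b in buckets)


# Hypothetical observed counters, grouped by the sampling rate applied.
buckets = [
    {"bytes": 120, "rate": 0},     # packets kept because of SYN/FIN
    {"bytes": 4000, "rate": 50},   # everything else, sampled 1-in-50
]

naive = naive_estimate(buckets, 50)      # over-counts the always-kept bytes
corrected = per_rate_estimate(buckets)   # needs the rate recorded per counter
```

The corrected version only works if the applied rate is tracked alongside each counter, which is the extra bookkeeping this con refers to.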
cc: @eranra @jotak @praveingk @shach33
wdyt about making it configurable, e.g.
sampling:
  syn: 0
  fin: 0
  default: 50
  (<any flag>: <value>)
That would allow mitigating con 2 by giving users more control over it. The default values could be the example above, with no sampling on syn/fin, keeping the current default of 50 for the rest.
@jotak Just to make sure that I understand your configuration format, consider the following values:
sampling:
  syn: 0
  fin: 10
  default: 50
It should be interpreted as:
- send every flow that contains the SYN flag
- send every 10th flow that contains the FIN flag
- send every 50th flow that contains neither the SYN nor the FIN flag?
@jotak totally agree with the above, assuming:
(1) @shach33 @praveingk and @msherif1234 agree that this doesn't add a lot (or any) of overhead to the eBPF code :-)
(2) The user experience (operator) is friendly; defaulting precisely to what you have above would be perfect.
@ronensc yes, with the subtlety that it's not exactly "every Nth flow", as the sampling is probabilistic, not deterministic (as you can see here: https://github.com/netobserv/netobserv-ebpf-agent/blob/c54e7eb9e37e8ef5bb948eff6141cdddf584a6f9/bpf/flows.c#L201-L203). But I'm nitpicking :)
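A minimal sketch of that probabilistic interpretation, written in Python for readability rather than as eBPF C. It assumes a rate of 0 (or 1) means "always keep", a rate of N means "keep with probability 1/N", and that when several configured flags are set the lowest rate wins. That precedence rule is an assumption that was not discussed in this thread, and `effective_rate` and `keep` are hypothetical names.

```python
import random

# TCP flag bits as laid out in the TCP header.
TCP_FLAGS = {"fin": 0x01, "syn": 0x02, "rst": 0x04, "psh": 0x08, "ack": 0x10}


def effective_rate(tcp_flags, sampling):
    """Pick the sampling rate that applies to the given TCP flags byte.

    Assumption: if several configured flags are set, the lowest rate
    (the one that keeps the most traffic) takes precedence.
    """
    rates = [
        rate
        for name, rate in sampling.items()
        if name != "default" and tcp_flags & TCP_FLAGS[name]
    ]
    return min(rates) if rates else sampling["default"]


def keep(tcp_flags, sampling, rng=random):
    """Keep with probability 1/rate; a rate of 0 or 1 keeps everything."""
    rate = effective_rate(tcp_flags, sampling)
    if rate <= 1:
        return True
    return rng.randrange(rate) == 0


sampling = {"syn": 0, "fin": 10, "default": 50}
keep(0x02, sampling)   # SYN: always kept
keep(0x11, sampling)   # FIN+ACK: kept about 1 time in 10
keep(0x10, sampling)   # plain ACK: kept about 1 time in 50
```

Over many packets the kept fraction approaches 1/N, but any particular run of N packets may keep zero or several of them.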
@eranra My understanding is that it adds a little overhead, as we need to read TCP headers for every packet, including ones that could otherwise be ignored due to sampling. I guess the perf impact overall will still be largely positive.
@jotak maybe the user space eBPF agent could translate the friendly conf into more straightforward and efficient rules to be used in the kernel. Yes, we need to "look" at every packet, but this is what the eBPF TC hook is for anyway.
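One way such a translation might look, sketched in Python rather than in the agent's own code: user space expands the friendly config into a 256-entry table indexed by the TCP flags byte, so the kernel side would only need a single lookup per packet. The table layout, the `compile_rules` name and the lowest-rate-wins precedence are assumptions for illustration, not something the agent is known to implement.

```python
# All eight TCP flag bits, so that any flag name can appear in the config.
TCP_FLAGS = {
    "fin": 0x01, "syn": 0x02, "rst": 0x04, "psh": 0x08,
    "ack": 0x10, "urg": 0x20, "ece": 0x40, "cwr": 0x80,
}


def compile_rules(sampling):
    """Expand {"syn": 0, ..., "default": 50} into a rate per flags byte.

    Assumption: when several configured flags are set at once, the
    lowest rate wins. Unknown flag names raise KeyError.
    """
    per_flag = [
        (TCP_FLAGS[name], rate)
        for name, rate in sampling.items()
        if name != "default"
    ]
    table = []
    for flags in range(256):
        rates = [rate for bit, rate in per_flag if flags & bit]
        table.append(min(rates) if rates else sampling["default"])
    return table


table = compile_rules({"syn": 0, "fin": 10, "default": 50})
rate_for_syn_ack = table[0x12]   # looked up directly by the flags byte
```

The precedence decision is made once in user space, so the per-packet path does not have to evaluate the config rules itself.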