add conn entries for all tunnels and differentiate from non-tunnel conns
Currently, many tunnel types do not create conn entries. I believe the root cause is #3915, and if that behavior is resolved, this issue would be mostly resolved as well. VXLAN and GENEVE tunnels get entries in the conn log (since they are UDP-based), but GRE and IP-in-IP do not. This can be especially confusing because the tunnel_parents for connections inside these non-UDP tunnels contain a uid that has no associated conn entry. I also don't see a simple way to differentiate conn entries that are tunnels from connections that have not been identified as tunnels (for example, it may be useful to exclude tunnels from some calculations to avoid double-counting the traffic).
I am proposing that we add a field to the conn log to allow filtering out sessions that contain other connections (maybe a boolean named `is_tunnel` or something similar?). I am also proposing that we track these sessions as a 3-tuple in #3915. I think this issue is a superset of #3378.
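To illustrate the filtering use case, here is a rough Python sketch of what a log consumer has to do today: infer which conn entries are tunnels from `tunnel_parents`, then exclude them from byte totals. The helper names and sample records are made up for illustration; only `uid`, `tunnel_parents`, `orig_bytes`, and `resp_bytes` are real conn log fields. With the proposed field, the inference step would collapse to a single boolean check.

```python
def tunnel_uids(conns):
    """Collect every uid that some record names as a tunnel parent."""
    parents = set()
    for c in conns:
        parents.update(c.get("tunnel_parents") or [])
    return parents


def split_by_tunnel(conns):
    """Return (tunnels, non_tunnels, orphan_parent_uids).

    orphan_parent_uids are parents referenced in tunnel_parents that have
    no conn entry of their own (the GRE / IP-in-IP case described above).
    """
    parents = tunnel_uids(conns)
    seen = {c["uid"] for c in conns}
    tunnels = [c for c in conns if c["uid"] in parents]
    non_tunnels = [c for c in conns if c["uid"] not in parents]
    return tunnels, non_tunnels, parents - seen


def total_bytes(conns):
    """Sum orig_bytes + resp_bytes, treating missing values as 0."""
    return sum((c.get("orig_bytes") or 0) + (c.get("resp_bytes") or 0) for c in conns)


# Hypothetical records: one VXLAN tunnel with an inner connection, plus an
# inner connection whose GRE parent never got a conn entry.
conns = [
    {"uid": "Cvxlan1", "orig_bytes": 1500, "resp_bytes": 0, "tunnel_parents": []},
    {"uid": "Cinner1", "orig_bytes": 1000, "resp_bytes": 400, "tunnel_parents": ["Cvxlan1"]},
    {"uid": "Cinner2", "orig_bytes": 200, "resp_bytes": 50, "tunnel_parents": ["Cgre1"]},
]

tunnels, non_tunnels, orphans = split_by_tunnel(conns)
print(total_bytes(conns))        # counts the VXLAN payload twice
print(total_bytes(non_tunnels))  # inner connections only
print(sorted(orphans))           # parents with no conn entry
```

Note that this inference only works when the whole log is available, which is part of why an explicit field would be easier to use in streaming pipelines.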
There are also some special cases that we might want to consider for tunnels:
For UDP-based tunnels like VXLAN or GENEVE, I think it would be best to ignore the src/dst ports for session matching, as they are not actually part of the tunnel session but rather act as a flow-specific key to allow ECMP on flows within the tunnel. So, a GENEVE session could actually consist of `host-a:*->host-b:6081` and `host-b:*->host-a:6081`; however, I'm unsure of the best way to handle this in Zeek. It won't make a difference in cases where these tunnels are used for mirroring purposes, but if bidirectional traffic exists there will be a separate tunnel for each direction (this may or may not be an issue, though - edit: could there be more bugs like #1991?).
GRE has a concept of a tunnel key to differentiate sessions between two endpoints. However, since Zeek isn't really exposing the key, I think it would be reasonable to ignore that special case for now (there would be no loss of functionality), although I suppose the 4-byte key could also be encoded into orig_p/resp_p somehow.
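Both special cases can be sketched in a few lines of Python. This is only an illustration of the idea, not how Zeek does or should implement it: `TunnelKey`, `udp_tunnel_key`, `split_gre_key`, and `join_gre_key` are hypothetical names, the tunnel is recognized purely by its destination port (4789 for VXLAN, 6081 for GENEVE), and splitting the GRE key into its high and low 16 bits is just one possible encoding into two port-sized values.

```python
from typing import NamedTuple, Optional

# Well-known destination ports for the UDP-based encapsulations.
UDP_TUNNEL_PORTS = {4789: "VXLAN", 6081: "GENEVE"}


class TunnelKey(NamedTuple):
    host_a: str
    host_b: str
    encap: str


def udp_tunnel_key(src: str, sport: int, dst: str, dport: int) -> Optional[TunnelKey]:
    """Map a UDP flow to a direction-independent 3-tuple tunnel key.

    The source port is deliberately ignored because it only carries
    per-flow entropy for ECMP. Returns None if dport is not a known
    tunnel port.
    """
    encap = UDP_TUNNEL_PORTS.get(dport)
    if encap is None:
        return None
    host_a, host_b = sorted((src, dst))  # order endpoints so both directions match
    return TunnelKey(host_a, host_b, encap)


def split_gre_key(key: int) -> tuple:
    """Split a 32-bit GRE key into (high 16 bits, low 16 bits)."""
    if not 0 <= key <= 0xFFFFFFFF:
        raise ValueError("GRE key must fit in 32 bits")
    return key >> 16, key & 0xFFFF


def join_gre_key(orig_p: int, resp_p: int) -> int:
    """Inverse of split_gre_key."""
    return (orig_p << 16) | resp_p


# Two GENEVE flows in opposite directions with different source ports
# collapse to the same tunnel session.
k1 = udp_tunnel_key("10.0.0.1", 49152, "10.0.0.2", 6081)
k2 = udp_tunnel_key("10.0.0.2", 51000, "10.0.0.1", 6081)
print(k1 == k2)

orig_p, resp_p = split_gre_key(0x12345678)
print(orig_p, resp_p)
```

One downside of the port encoding is that the resulting orig_p/resp_p values would look like real ports to anything consuming the log, so a dedicated field might be cleaner if the key is ever exposed.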