"ip6 and tcp[tcpflags] & (tcp-syn) != 0" gets "expression rejects all packets"
I'm not a novice user. I've run this today:
$ sudo tcpdump -ni lo 'ip6 and tcp[tcpflags] & (tcp-syn) != 0' -t
tcpdump: expression rejects all packets
I mean... I know where this comes from. In IPv6 mode tcpdump refuses to look at higher protocol stuff. However, this is plain silly, and a usability nightmare. On the modern internet there are not many next-headers actually in use. Also, this match is completely implementable in cBPF and eBPF. Maybe it's time to rethink usability nightmares like this?
One (not the best) option is to add something like tcp6, which could be an alias for ip6[40].
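Until something like tcp6 exists, a workaround that is often used is to index from the IPv6 header by hand, for example ip6 and ip6[6] = 6 and ip6[53] & 0x02 != 0 (next-header byte at offset 6, TCP flags byte at 40 + 13). This only holds when there are no extension headers. A minimal Python sketch of the same byte tests on an Ethernet frame follows; the helper names make_frame and ip6_tcp_syn are made up for illustration and are not part of tcpdump or libpcap.

```python
import struct

ETH_HLEN = 14       # Ethernet header length
IP6_HLEN = 40       # fixed IPv6 header length
TCP_FLAGS_OFF = 13  # offset of the flags byte inside the TCP header
TH_SYN = 0x02

def make_frame(tcp_flags: int, next_header: int = 6) -> bytes:
    """Build a minimal Ethernet + IPv6 + TCP frame (hypothetical test helper)."""
    eth = bytes(12) + struct.pack("!H", 0x86DD)
    ip6 = struct.pack("!IHBB", 0x60000000, 20, next_header, 64) + bytes(32)
    tcp = struct.pack("!HHIIBBHHH", 12345, 80, 0, 0, 0x50, tcp_flags, 65535, 0, 0)
    return eth + ip6 + tcp

def ip6_tcp_syn(frame: bytes) -> bool:
    """Same tests as: ip6 and ip6[6] = 6 and ip6[53] & 0x02 != 0."""
    if len(frame) < ETH_HLEN + IP6_HLEN + TCP_FLAGS_OFF + 1:
        return False
    if struct.unpack_from("!H", frame, 12)[0] != 0x86DD:
        return False                      # not IPv6
    ip6 = frame[ETH_HLEN:]
    if ip6[6] != 6:
        return False                      # next header is not TCP
    # Assumption: no extension headers, so TCP starts right after 40 bytes.
    return (ip6[IP6_HLEN + TCP_FLAGS_OFF] & TH_SYN) != 0
```

A packet that carries any extension header before TCP fails the ip6[6] = 6 test and is silently missed, which is the main weakness of fixed offsets.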
This is a documented problem:
Arithmetic expression against transport layer headers, like tcp[0],
does not work against IPv6 packets. It only looks at IPv4 packets.
It would be nice to have a solution, and to that end it would help to answer two questions:
- What BPF bytecode could iterate over IPv6 headers well enough?
- What filter syntax could generate such bytecode?
Given that ip6 and tcp port 17 generates
(000) ldh [12]
(001) jeq #0x86dd jt 2 jf 9
(002) ldb [20]
(003) jeq #0x6 jt 4 jf 9
(004) ldh [54]
(005) jeq #0x11 jt 8 jf 6
(006) ldh [56]
(007) jeq #0x11 jt 8 jf 9
(008) ret #262144
(009) ret #0
for Ethernet, I don't see why similar code couldn't be produced for ip6 and tcp[tcpflags] & (tcp-syn) != 0, as both tcp port 17 and tcp[tcpflags] & (tcp-syn) != 0 test a single field in the TCP header.
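To make the listing above easier to read: offset 12 is the Ethertype, 20 is the IPv6 next-header byte (14 + 6), and 54 and 56 are the TCP source and destination ports (14 + 40 + 0 and + 2). The following Python sketch interprets just the four opcodes used in that listing. It is a toy interpreter written for illustration, not libpcap's or the kernel's implementation, and make_frame is a hypothetical helper.

```python
import struct

# (opcode, operand, jump-if-true, jump-if-false), transcribed from the listing.
PROG = [
    ("ldh", 12, None, None),
    ("jeq", 0x86DD, 2, 9),
    ("ldb", 20, None, None),
    ("jeq", 0x6, 4, 9),
    ("ldh", 54, None, None),
    ("jeq", 0x11, 8, 6),
    ("ldh", 56, None, None),
    ("jeq", 0x11, 8, 9),
    ("ret", 262144, None, None),
    ("ret", 0, None, None),
]

def run(prog, pkt: bytes) -> int:
    """Run the toy program; returns the snapshot length (0 means reject)."""
    acc, pc = 0, 0
    while True:
        op, k, jt, jf = prog[pc]
        if op == "ldh":
            if k + 2 > len(pkt):
                return 0              # out-of-bounds load rejects the packet
            acc = struct.unpack_from("!H", pkt, k)[0]
            pc += 1
        elif op == "ldb":
            if k + 1 > len(pkt):
                return 0
            acc = pkt[k]
            pc += 1
        elif op == "jeq":
            pc = jt if acc == k else jf   # targets are absolute, as printed
        else:                             # "ret"
            return k

def make_frame(sport: int, dport: int, ethertype: int = 0x86DD) -> bytes:
    """Minimal Ethernet + IPv6 + TCP frame (hypothetical test helper)."""
    eth = bytes(12) + struct.pack("!H", ethertype)
    ip6 = struct.pack("!IHBB", 0x60000000, 20, 6, 64) + bytes(32)
    tcp = struct.pack("!HHIIBBHHH", sport, dport, 0, 0, 0x50, 0x02, 65535, 0, 0)
    return eth + ip6 + tcp
```

Every load in the program uses a constant offset, which is why it only matches TCP that immediately follows the fixed IPv6 header.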
Neither of those iterates over IPv6 headers. Few if any in-kernel BPF implementations allow unbounded loops. The loop isn't really unbounded, as it advances forward through the IPv6 headers and would eventually stop when it runs out of packet data, but the BPF code in the kernel doesn't know that.
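One possible way around the no-loops restriction is to cap the walk at a fixed number of extension headers, so that a code generator could unroll it into straight-line code. The Python sketch below shows only the logic of such a bounded walk, not BPF bytecode. MAX_EXT_HEADERS and find_transport are names invented here, the cap of 4 is an arbitrary assumption, and only the hop-by-hop, routing, destination-options and fragment headers are handled.

```python
import struct

EXT_VARLEN = {0, 43, 60}  # hop-by-hop, routing, destination options
FRAGMENT = 44             # fragment header, always 8 bytes
MAX_EXT_HEADERS = 4       # hypothetical cap so the walk can be unrolled

def is_ext(next_header: int) -> bool:
    return next_header in EXT_VARLEN or next_header == FRAGMENT

def find_transport(ip6: bytes):
    """Return (protocol, offset) of the upper-layer header, or None.

    ip6 starts at the IPv6 header. Gives up (None) after MAX_EXT_HEADERS
    extension headers, on truncation, or on a non-first fragment.
    """
    if len(ip6) < 40:
        return None
    nxt, off = ip6[6], 40
    for _ in range(MAX_EXT_HEADERS):      # bounded: at most this many steps
        if not is_ext(nxt):
            return nxt, off
        if off + 8 > len(ip6):
            return None                   # truncated extension header
        if nxt == FRAGMENT:
            frag_off = struct.unpack_from("!H", ip6, off + 2)[0] >> 3
            if frag_off != 0:
                return None               # no transport header in this fragment
            hdr_len = 8
        else:
            hdr_len = (ip6[off + 1] + 1) * 8  # length in 8-byte units, minus one
        nxt = ip6[off]
        off += hdr_len
    return None if is_ext(nxt) else (nxt, off)

def ip6_header(next_header: int) -> bytes:
    """Fixed 40-byte IPv6 header (hypothetical test helper)."""
    return struct.pack("!IHBB", 0x60000000, 0, next_header, 64) + bytes(32)
```

The trade-off is that packets with more extension headers than the cap are rejected rather than matched, so the cap would need to be documented.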
"expression rejects all packets" is only produced if the optimizer is used, which it is by default in tcpdump.
The un-optimized code generated for ip6 and tcp[tcpflags] & (tcp-syn) != 0 begins with
(000) ldh [12]
(001) jeq #0x86dd jt 2 jf 39
(002) ldh [12]
(003) jeq #0x800 jt 4 jf 39
(004) ldh [12]
(005) jeq #0x800 jt 6 jf 8
(006) ldb [23]
(007) jeq #0x6 jt 16 jf 8
which is pretty much if (ip6 && ip && ip), which is obviously not going to accept any packets, as the Ethertype cannot be both 0x86dd and 0x800.
The un-optimized code generated for ip6 and tcp port 17, however, begins with
(000) ldh [12]
(001) jeq #0x86dd jt 2 jf 23
(002) ldh [12]
(003) jeq #0x86dd jt 4 jf 10
(004) ldb [20]
(005) jeq #0x6 jt 6 jf 10
which avoids those bogus tests for IPv4. Whatever's causing those bogus tests may be the sole cause of the problem. (And even the ip6 and tcp port 17 code is testing the Ethertype twice for 0x86dd, which is pointless.)
I don't see why similar code couldn't be produced for ...
But the general case is more complicated, as both the LHS and RHS of the comparison operator in "expr1 relop expr2" can be fairly arbitrary expressions, with some being above the network layer and some being below the network layer, so a straightforward code generator would have to emit
    if (ipv6)
        load TCP header data assuming an IPv6 header (and no extension headers)
    else
        load TCP header data assuming an IPv4 header
for every TCP data fetch, and similar code for UDP or other transport-layer fetches. The current optimizer might not work well with that.
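The per-fetch branching described above can be sketched in Python as follows. This only illustrates the logic a code generator would have to emit for each tcp[...] fetch; it is not how libpcap is written, and tcp_base, tcp_byte, tcp_syn and the frame builders are hypothetical names. As stated above, the IPv6 branch assumes no extension headers.

```python
import struct

ETH_HLEN = 14

def tcp_base(frame: bytes):
    """Offset of the TCP header in an Ethernet frame, or None if not TCP."""
    if len(frame) < ETH_HLEN:
        return None
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype == 0x86DD:                       # IPv6 branch
        if len(frame) < 54 or frame[20] != 6:
            return None
        return ETH_HLEN + 40                      # assumes no extension headers
    if ethertype == 0x0800:                       # IPv4 branch
        if len(frame) < 34 or frame[23] != 6:
            return None
        if struct.unpack_from("!H", frame, 20)[0] & 0x1FFF:
            return None                           # non-first fragment
        return ETH_HLEN + (frame[14] & 0x0F) * 4  # IHL is in 32-bit words
    return None

def tcp_byte(frame: bytes, index: int):
    """Equivalent of tcp[index] for either IP version; None if unavailable."""
    base = tcp_base(frame)
    if base is None or base + index >= len(frame):
        return None
    return frame[base + index]

def tcp_syn(frame: bytes) -> bool:
    """Equivalent of: tcp[tcpflags] & tcp-syn != 0, for IPv4 and IPv6."""
    flags = tcp_byte(frame, 13)
    return flags is not None and (flags & 0x02) != 0

def _tcp(flags: int) -> bytes:
    return struct.pack("!HHIIBBHHH", 12345, 80, 0, 0, 0x50, flags, 65535, 0, 0)

def make_v6(flags: int) -> bytes:
    """Ethernet + IPv6 + TCP (hypothetical test helper)."""
    ip6 = struct.pack("!IHBB", 0x60000000, 20, 6, 64) + bytes(32)
    return bytes(12) + struct.pack("!H", 0x86DD) + ip6 + _tcp(flags)

def make_v4(flags: int, options: bytes = b"") -> bytes:
    """Ethernet + IPv4 (+ options, multiple of 4 bytes) + TCP (test helper)."""
    ihl = 5 + len(options) // 4
    ip4 = struct.pack("!BBHHHBBH4s4s", 0x40 | ihl, 0, 40 + len(options),
                      0, 0, 64, 6, 0, bytes(4), bytes(4)) + options
    return bytes(12) + struct.pack("!H", 0x0800) + ip4 + _tcp(flags)
```

Note that the IPv4 branch already needs a variable offset (the IHL field), so the two branches differ in more than a constant, which is part of what makes this awkward for an optimizer.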