libpcap
libpcap copied to clipboard
Filter optimizing HANGS in infinite loop
Converted from SourceForge issue 1969699, submitted by nulldef
well, trying to compile the following filter with optimization enabled gets to an infinite loop, and hangs the process on 100% cpu usage.
try the following: tcpdump -d "(ip[12:4] = 0x0a0a0a0a or ip[16:4] = 0x0a0a0a0a) and (not ((ip[12:4] = 0x0a0a0a0b or ip[16:4] = 0x0a0a0a0b)))"
i know for sure that the problem exists in version 0.9.7 and 0.9.8, and i think it doesn't exists in 0.8.3.
from a little research and debugging i've done-
the optimizer optimizes the code to a state A, then "optimize" it to state B, and then "optimize" it back to state A.
the functions that are "guilt" in that infinite and redundant optimization are 'or_pullup()' and 'opt_j()', both called from 'opt_blks()', which is called infinitely from 'opt_loop()'.
the main difference between A and B versions of the code is the order in which the two right inner expressions are checked. the rest seems to be the same.
good hunting! :-)
Submitted by nulldef
just to remind, the problem exists also in the libpcap 1.0.0.
This bug reproduces on the latest build of tcpdump.
This bug reproduces on: tcpdump version 4.2.1 libpcap version 1.1.1
Confirmed on latest libpcap/tcpdump on Windows too. But not if I try the filter on an AirPcap adapter. I assume it's due to a DLT of IEEE802_11 or something.
Confirmed on latest libpcap/tcpdump on Windows too. But not if I try the filter on an AirPcap adapter. I assume it's due to a DLT of IEEE802_11 or something.
Yes - 802.11 filter code does things the optimizer currently can't handle, so the optimizer is disabled for them.
The optimizer eventually gets into this loop, shuffling tests around and switching between two versions of the generated code:
opt_loop(root, 0) bottom, done=0
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 23
-- (002) ldx #0xc
-- (003) ld [x + 14]
-- (004) st M[1]
-- (005) ld M[1]
[01](006) jeq #0xa0a0a0a jt 12 jf 7
-- (007) ldx #0x10
-- (008) ld [x + 14]
-- (009) st M[3]
-- (010) ld M[3]
[09](011) jeq #0xa0a0a0a jt 12 jf 23
-- (012) ldx #0x10
-- (013) ld [x + 14]
-- (014) st M[7]
-- (015) ld M[7]
[06](016) jeq #0xa0a0a0b jt 23 jf 17
-- (017) ldx #0xc
-- (018) ld [x + 14]
-- (019) st M[5]
-- (020) ld M[5]
[03](021) jeq #0xa0a0a0b jt 23 jf 22
[07](022) ret #262144
[04](023) ret #0
opt_loop(root, 0) bottom, done=0
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 23
-- (002) ldx #0xc
-- (003) ld [x + 14]
-- (004) st M[1]
-- (005) ld M[1]
[01](006) jeq #0xa0a0a0a jt 12 jf 7
-- (007) ldx #0x10
-- (008) ld [x + 14]
-- (009) st M[3]
-- (010) ld M[3]
[09](011) jeq #0xa0a0a0a jt 12 jf 23
-- (012) ldx #0xc
-- (013) ld [x + 14]
-- (014) st M[5]
-- (015) ld M[5]
[03](016) jeq #0xa0a0a0b jt 23 jf 17
-- (017) ldx #0x10
-- (018) ld [x + 14]
-- (019) st M[7]
-- (020) ld M[7]
[06](021) jeq #0xa0a0a0b jt 23 jf 22
[07](022) ret #262144
[04](023) ret #0
opt_loop(root, 0) bottom, done=0
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 23
-- (002) ldx #0xc
-- (003) ld [x + 14]
-- (004) st M[1]
-- (005) ld M[1]
[01](006) jeq #0xa0a0a0a jt 12 jf 7
-- (007) ldx #0x10
-- (008) ld [x + 14]
-- (009) st M[3]
-- (010) ld M[3]
[09](011) jeq #0xa0a0a0a jt 12 jf 23
-- (012) ldx #0x10
-- (013) ld [x + 14]
-- (014) st M[7]
-- (015) ld M[7]
[06](016) jeq #0xa0a0a0b jt 23 jf 17
-- (017) ldx #0xc
-- (018) ld [x + 14]
-- (019) st M[5]
-- (020) ld M[5]
[03](021) jeq #0xa0a0a0b jt 23 jf 22
[07](022) ret #262144
[04](023) ret #0
opt_loop(root, 1) begin
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 23
-- (002) ldx #0xc
-- (003) ld [x + 14]
-- (004) st M[1]
-- (005) ld M[1]
[01](006) jeq #0xa0a0a0a jt 12 jf 7
-- (007) ldx #0x10
-- (008) ld [x + 14]
-- (009) st M[3]
-- (010) ld M[3]
[09](011) jeq #0xa0a0a0a jt 12 jf 23
-- (012) ldx #0x10
-- (013) ld [x + 14]
-- (014) st M[7]
-- (015) ld M[7]
[06](016) jeq #0xa0a0a0b jt 23 jf 17
-- (017) ldx #0xc
-- (018) ld [x + 14]
-- (019) st M[5]
-- (020) ld M[5]
[03](021) jeq #0xa0a0a0b jt 23 jf 22
[07](022) ret #262144
[04](023) ret #0
At that point we're just moving edges and pulling up OR nodes.
If we then just quit when we get to a point where two passes in a row just moved stuff around, and fall into the next optimizing stage, we then optimize it to:
opt_loop(root, 1) bottom, done=0
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 11
-- (002) ld [26]
[01](003) jeq #0xa0a0a0a jt 6 jf 4
-- (004) ld [30]
[09](005) jeq #0xa0a0a0a jt 6 jf 11
-- (006) ld [30]
[06](007) jeq #0xa0a0a0b jt 11 jf 8
-- (008) ld [26]
[03](009) jeq #0xa0a0a0b jt 11 jf 10
[07](010) ret #262144
[04](011) ret #0
opt_loop(root, 1) bottom, done=1
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 11
-- (002) ld [26]
[01](003) jeq #0xa0a0a0a jt 6 jf 4
-- (004) ld [30]
[09](005) jeq #0xa0a0a0a jt 6 jf 11
-- (006) ld [30]
[06](007) jeq #0xa0a0a0b jt 11 jf 8
-- (008) ld [26]
[03](009) jeq #0xa0a0a0b jt 11 jf 10
[07](010) ret #262144
[04](011) ret #0
at which point we haven't make any changes to the code, and give up.
So why did it not do that last optimization until we stopped doing the code shuffling?
Also, if we do another optimizer loop with do_stmts 0, it goes back to looping, only this time shuffling code that doesn't use scratch memory. Are there more control-flow optimization that it's not doing that would just turn it into ret #0, at which point it stops, and then we report that the "expression rejects all packets"?
after intern_blocks()
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 11
-- (002) ld [26]
[01](003) jeq #0xa0a0a0a jt 6 jf 4
-- (004) ld [30]
[09](005) jeq #0xa0a0a0a jt 6 jf 11
-- (006) ld [30]
[06](007) jeq #0xa0a0a0b jt 11 jf 8
-- (008) ld [26]
[03](009) jeq #0xa0a0a0b jt 11 jf 10
[07](010) ret #262144
[04](011) ret #0
after opt_root()
-- (000) ldh [12]
[00](001) jeq #0x800 jt 2 jf 11
-- (002) ld [26]
[01](003) jeq #0xa0a0a0a jt 6 jf 4
-- (004) ld [30]
[09](005) jeq #0xa0a0a0a jt 6 jf 11
-- (006) ld [30]
[06](007) jeq #0xa0a0a0b jt 11 jf 8
-- (008) ld [26]
[03](009) jeq #0xa0a0a0b jt 11 jf 10
[07](010) ret #262144
[04](011) ret #0
f2d84366a864f7b41f59ef47334f6a53aa914b32 adds a hack to detect that type of loop. I don't consider it a true fix, because it doesn't detect the loop in a more direct fashion, nor does it prevent the problem in the first place.
I'll leave this open pending a better fix, even though the problem should no longer occur.