
Using logical combination in block mode has low performance

Open: pzhang714 opened this issue 4 years ago • 1 comment

Hi, I found a problem with Hyperscan 5.2.1. I added the HS_FLAG_QUIET and HS_FLAG_COMBINATION flags to the pcapscan.cc example program in the examples directory and used block mode to scan a pcap file containing 100000 packets. A single thread scans against the Hyperscan database, and I tested the following two expression configuration files in turn:
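For readers reproducing this, the flag characters in the pattern files have to be translated into Hyperscan flag bits somewhere in the example's pattern-file parsing. The following is only a sketch of how that mapping might look, not the reporter's actual patch: `parseExtraFlags` is a hypothetical helper, and the two constants are stand-ins whose values are assumed to mirror `HS_FLAG_COMBINATION` and `HS_FLAG_QUIET` in `<hs.h>`, so that the snippet compiles without the library.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins for the macros that <hs.h> provides. The values are assumed to
// mirror the Hyperscan headers; real code should include <hs.h> and use
// HS_FLAG_COMBINATION and HS_FLAG_QUIET directly.
constexpr unsigned kFlagCombination = 512;  // assumed value of HS_FLAG_COMBINATION
constexpr unsigned kFlagQuiet = 1024;       // assumed value of HS_FLAG_QUIET

// Hypothetical helper: turn the flag characters that follow a pattern
// (for example the "q" in "1:/abcd1ef/q") into a flag bitmask.
// Only the two characters used in this report are handled.
unsigned parseExtraFlags(const std::string &flagChars) {
    unsigned flags = 0;
    for (char c : flagChars) {
        switch (c) {
        case 'q':
            flags |= kFlagQuiet;        // suppress match reports for this subrule
            break;
        case 'c':
            flags |= kFlagCombination;  // treat the expression as a logical combination
            break;
        default:
            throw std::invalid_argument(std::string("unsupported flag: ") + c);
        }
    }
    return flags;
}
```

An empty flag string yields 0, which corresponds to the "default flag" case used by the first configuration.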

The first type:
There are 10000 expressions, all using the default flag.

1:/abcd1ef/q
2:/abcd1ef/
3:/abcd3ef/
4:/abcd4ef/
5:/abcd5ef/
6:/abcd6ef/
7:/abcd7ef/
8:/abcd8ef/
9:/abcd9ef/
10:/abcd10ef/
11:/abcd11ef/
12:/abcd12ef/
13:/abcd13ef/
14:/abcd14ef/
15:/abcd15ef/
...
10000:/abcd10000ef/

The second type:
Three subrules are provided, all of which use HS_FLAG_QUIET, and 10000 logical combinations are then generated from these three subrules. The character 'q' corresponds to the HS_FLAG_QUIET flag, and the character 'c' corresponds to the HS_FLAG_COMBINATION flag.

1:/abcd1ef/q
2:/abcd2ef/q
3:/abcd3ef/q
4:/(1|2)/c
5:/(2|3)/c
6:/(2|3)/c
7:/(2|3)/c
8:/(2|3)/c
9:/(2|3)/c
10:/(2|3)/c
11:/(2|3)/c
12:/(2|3)/c
13:/(2|3)/c
14:/(2|3)/c
15:/(2|3)/c
...
10000:/(2|3)/c
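To make the second configuration easier to reproduce, here is a small sketch that regenerates its lines. It assumes the elided entries follow the visible pattern (IDs 1 to 3 are quiet subrules, ID 4 is `(1|2)`, and every later ID is `(2|3)`); `makeCombinationConfig` is a hypothetical name, not something from the Hyperscan examples.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical generator for the second pattern file described above.
// Assumption: every ID from 5 up to `total` repeats the "(2|3)" combination.
std::vector<std::string> makeCombinationConfig(int total) {
    std::vector<std::string> lines;
    if (total <= 0) {
        return lines;
    }
    lines.reserve(static_cast<std::size_t>(total));
    for (int id = 1; id <= total; ++id) {
        std::string body;
        if (id <= 3) {
            // Quiet subrules: /abcd1ef/q, /abcd2ef/q, /abcd3ef/q
            body = "/abcd" + std::to_string(id) + "ef/q";
        } else if (id == 4) {
            body = "/(1|2)/c";
        } else {
            body = "/(2|3)/c";
        }
        lines.push_back(std::to_string(id) + ":" + body);
    }
    return lines;
}
```

Calling `makeCombinationConfig(10000)` and writing one element per line should give a file with the same shape as the one used in this report.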

The "abcd1ef" expression will match, and the expressions "abcd2ef" through "abcd10000ef" will not.

After running, the performance of the first configuration is very high, while that of the second configuration is very low. The specific results are as follows:

********************************************************* First configuration *********************************
Pattern file: ../config/policy.conf
Compiling Hyperscan databases with 10000 patterns.
Hyperscan block mode database compiled in 0.42 seconds.
PCAP input file: ../var/top10w.pcap
100000 packets in 1 streams, totalling 3559368 bytes.
Average packet length: 35 bytes.

Block mode Hyperscan database size : 1241832 bytes.

Block mode:

Total matches: 1102
Match rate: 0.3170 matches/kilobyte
Throughput: 3521.71 megabits/sec

********************************************************* The second configuration *********************************
Pattern file: ../config/policy2.conf
Compiling Hyperscan databases with 10000 patterns.
Hyperscan block mode database compiled in 0.04 seconds.
PCAP input file: ../var/top10w.pcap
100000 packets in 1 streams, totalling 3559368 bytes.
Average packet length: 35 bytes.

Block mode Hyperscan database size : 804904 bytes.

Block mode:

Total matches: 1102
Match rate: 0.3170 matches/kilobyte
Throughput: 0.82 megabits/sec

Summary: The scan throughput of the first configuration is 3521.71 megabits/sec, while the second configuration reaches only 0.82 megabits/sec, a difference of more than 4000 times.

Question: Why is performance so low when using logical combinations?

pzhang714, Dec 18 '20 03:12

Have you tried deleting 6~10000 as they're all duplicated expressions?

fatchanghao, Oct 20 '22 08:10
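The suggestion above can be tried mechanically before compiling the database. The following is a minimal sketch that keeps only the first line for each distinct expression-and-flags pair; `dropDuplicateExpressions` is a hypothetical helper, and it assumes every line has the `id:/expression/flags` shape shown in the report.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: keep the first line for each distinct
// "/expression/flags" part and drop later lines that repeat it.
// Lines without a ':' separator are passed through unchanged.
std::vector<std::string> dropDuplicateExpressions(const std::vector<std::string> &lines) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> kept;
    for (const auto &line : lines) {
        const auto colon = line.find(':');
        if (colon == std::string::npos) {
            kept.push_back(line);
            continue;
        }
        // Everything after the ID, e.g. "/(2|3)/c", identifies the expression.
        const std::string key = line.substr(colon + 1);
        if (seen.insert(key).second) {
            kept.push_back(line);
        }
    }
    return kept;
}
```

For the second configuration this would leave five lines (IDs 1 to 5). Note that removing IDs changes which combination IDs can be reported, so the total match count may no longer be comparable with the original run, and the thread does not confirm whether this restores throughput.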