gatekeeper
/8 prefix FIB routes test: gatekeeper caught SIGTERM
I simulated /8-prefix FIB routes (224 prefixes in total) and then launched a SYN+ACK attack with random source addresses (`--rand-source`). After more than 40 minutes, the GK server caught a SIGTERM. Gatekeeper version: 1.0.0.
Does doubling the number of these routes make it easier to reproduce the problem? Does the problem happen with half of these routes? What kinds of routes are they? Are they all of type `GK_FWD_GRANTOR`?
There is not enough context to formulate hypotheses to test in order to eventually figure the problem out. It would help if you described your testbed. Doesn't Gatekeeper's log have any clue?
Repeating your experiment with Gatekeeper running under `gdb` would likely produce valuable context. Once the SIGTERM is triggered, `gdb` will give you some information about the failure and the opportunity to enter the command `bt`. The output of `bt` will likely include a clue about what's happening.
The GK log doesn't have any clues; it just shows that the GK server restarted.
- Simulated the FIB route prefixes imported from the ISP; each prefix is a class A address, 224 class A addresses in total.
- The `GK_FWD_GRANTOR` forwarding entry is 30.30.30.0/24.
- DDoS attack: random source IPs attack the destination 30.30.30.30/32 with TCP flags SYN+ACK (`-SA` mode). The source IP is changed after every 64 packets.
- Attack tool: hping3.
Could it be that the mask is too large and the lcore is overloaded computing flow-table entries? After about 43 minutes of the simulated attack, the GK server restarts.
The GK flow table is set very large (`flow_ht_size = 250000000`). Is it because the class A addresses will exceed the total number of entries in the flow table?
> The GK flow table is set very large (`flow_ht_size = 250000000`). Is it because the class A addresses will exceed the total number of entries in the flow table?
How much RAM does the Gatekeeper server have? How much of the RAM is allocated to huge pages?
> Could it be that the mask is too large and the lcore is overloaded computing flow-table entries? After about 43 minutes of the simulated attack, the GK server restarts.
Post your parameters in `lua/net.lua`.
> The GK log doesn't have any clues; it just shows that the GK server restarted.
>
> - Simulated the FIB route prefixes imported from the ISP; each prefix is a class A address, 224 class A addresses in total.
> - The `GK_FWD_GRANTOR` forwarding entry is 30.30.30.0/24.
> - DDoS attack: random source IPs attack the destination 30.30.30.30/32 with TCP flags SYN+ACK (`-SA` mode). The source IP is changed after every 64 packets.
> - Attack tool: hping3.
This setup is fine. Please run Gatekeeper under `gdb` as instructed before.
One minute after the following log appears, the GK server restarts. We do not perform any operations on the flows on the GK server; we just follow the DDoS attack scenario of this issue:
1. Simulate the FIB route prefixes imported from the ISP; each prefix is a class A address, 224 class A addresses in total.
2. The `GK_FWD_GRANTOR` forwarding entry is 30.30.30.0/24.
3. DDoS attack: random source IPs attack the destination 30.30.30.30/32 with TCP flags SYN+ACK (`-SA` mode). The source IP is changed after every 64 packets.
4. Attack tool: hping3.
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 181.62.222.216, dst: 30.30.30.216) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x1baf20f8, expire_at: 0xa91f223f21f, last_packet_seen_at: 0xa8f493f3574, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 168.218.231.110, dst: 30.30.30.219) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 168.218.231.110, dst: 30.30.30.219) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xc9f79aac, expire_at: 0xa91f226930b, last_packet_seen_at: 0xa8f4941d662, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 117.208.248.214, dst: 30.30.30.78) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 117.208.248.214, dst: 30.30.30.78) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x3abf01f8, expire_at: 0xa91f2271d35, last_packet_seen_at: 0xa8f4942608a, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 73.181.6.93, dst: 30.30.30.161) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 73.181.6.93, dst: 30.30.30.161) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xead19c74, expire_at: 0xa91f22b6947, last_packet_seen_at: 0xa8f4946ac9c, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 209.207.91.168, dst: 30.30.30.125) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 209.207.91.168, dst: 30.30.30.125) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x25eb2f64, expire_at: 0xa91f22c4583, last_packet_seen_at: 0xa8f494788da, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 108.161.214.124, dst: 30.30.30.26) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 108.161.214.124, dst: 30.30.30.26) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x64925f14, expire_at: 0xa91f22cd2ef, last_packet_seen_at: 0xa8f49481644, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 18.92.110.36, dst: 30.30.30.215) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 18.92.110.36, dst: 30.30.30.215) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x55a90d6c, expire_at: 0xa91f22e59bf, last_packet_seen_at: 0xa8f49499d16, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 25.204.128.30, dst: 30.30.30.106) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 25.204.128.30, dst: 30.30.30.106) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x634eb000, expire_at: 0xa91f22f981d, last_packet_seen_at: 0xa8f494adb74, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 80.24.27.39, dst: 30.30.30.108) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 80.24.27.39, dst: 30.30.30.108) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x882acbf4, expire_at: 0xa91f23021f7, last_packet_seen_at: 0xa8f494b654c, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 61.219.12.131, dst: 30.30.30.229) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 61.219.12.131, dst: 30.30.30.229) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x84dca0f0, expire_at: 0xa91f230abc1, last_packet_seen_at: 0xa8f494bef12, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 25.108.248.27, dst: 30.30.30.214) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 25.108.248.27, dst: 30.30.30.214) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x3eafe80c, expire_at: 0xa91f2313c79, last_packet_seen_at: 0xa8f494c7fd0, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 23.92.208.105, dst: 30.30.30.11) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 23.92.208.105, dst: 30.30.30.11) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xd2ec7f44, expire_at: 0xa91f231d357, last_packet_seen_at: 0xa8f494d176e, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 84.251.228.93, dst: 30.30.30.143) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 84.251.228.93, dst: 30.30.30.143) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xb698c09c, expire_at: 0xa91f232b949, last_packet_seen_at: 0xa8f494dfc9e, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 27.155.110.170, dst: 30.30.30.168) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 27.155.110.170, dst: 30.30.30.168) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x970c6fe0, expire_at: 0xa91f2338ea7, last_packet_seen_at: 0xa8f494ed1fe, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 130.43.11.199, dst: 30.30.30.68) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 130.43.11.199, dst: 30.30.30.68) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xeb167e88, expire_at: 0xa91f233d731, last_packet_seen_at: 0xa8f494f1a8a, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 161.43.224.184, dst: 30.30.30.91) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 161.43.224.184, dst: 30.30.30.91) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xdc35a3bc, expire_at: 0xa91f234ea8d, last_packet_seen_at: 0xa8f49502de0, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 171.17.171.197, dst: 30.30.30.73) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 171.17.171.197, dst: 30.30.30.73) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x76ea13f4, expire_at: 0xa91f23651a9, last_packet_seen_at: 0xa8f495194fe, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 85.189.81.75, dst: 30.30.30.73) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 85.189.81.75, dst: 30.30.30.73) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xf0604f5c, expire_at: 0xa91f2392725, last_packet_seen_at: 0xa8f49546a7a, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 122.232.37.189, dst: 30.30.30.36) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 122.232.37.189, dst: 30.30.30.36) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x4e32e230, expire_at: 0xa91f239687d, last_packet_seen_at: 0xa8f4954abd2, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 220.39.2.199, dst: 30.30.30.249) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 220.39.2.199, dst: 30.30.30.249) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xe8d6ae8c, expire_at: 0xa91f23c4e91, last_packet_seen_at: 0xa8f495791ec, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 136.114.45.219, dst: 30.30.30.212) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 136.114.45.219, dst: 30.30.30.212) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x6ddaac38, expire_at: 0xa91f23db01f, last_packet_seen_at: 0xa8f4958f378, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 205.97.131.92, dst: 30.30.30.182) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 205.97.131.92, dst: 30.30.30.182) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x53ce5b3c, expire_at: 0xa91f23e42df, last_packet_seen_at: 0xa8f49598636, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 198.44.62.87, dst: 30.30.30.117) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 198.44.62.87, dst: 30.30.30.117) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xab10b2ac, expire_at: 0xa91f2426199, last_packet_seen_at: 0xa8f495da4f2, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 10.77.240.209, dst: 30.30.30.217) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 10.77.240.209, dst: 30.30.30.217) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x4feca0c4, expire_at: 0xa91f2438f2d, last_packet_seen_at: 0xa8f495ed284, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 141.23.180.80, dst: 30.30.30.136) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 141.23.180.80, dst: 30.30.30.136) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xa2dd4434, expire_at: 0xa91f2455757, last_packet_seen_at: 0xa8f49609aae, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 34.61.237.170, dst: 30.30.30.94) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 34.61.237.170, dst: 30.30.30.94) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xec044b9c, expire_at: 0xa91f2459ec5, last_packet_seen_at: 0xa8f4960e21a, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 16.75.37.91, dst: 30.30.30.131) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 16.75.37.91, dst: 30.30.30.131) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xeb7bd028, expire_at: 0xa91f245ee37, last_packet_seen_at: 0xa8f4961318c, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 37.85.199.189, dst: 30.30.30.34) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 37.85.199.189, dst: 30.30.30.34) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xf29badcc, expire_at: 0xa91f2462c1b, last_packet_seen_at: 0xa8f4961704e, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 44.124.53.29, dst: 30.30.30.75) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 44.124.53.29, dst: 30.30.30.75) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x590c6cd0, expire_at: 0xa91f2467497, last_packet_seen_at: 0xa8f4961b7ea, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 184.110.54.45, dst: 30.30.30.0) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 184.110.54.45, dst: 30.30.30.0) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xf70f638c, expire_at: 0xa91f246fe53, last_packet_seen_at: 0xa8f496241a6, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 214.30.86.147, dst: 30.30.30.203) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 214.30.86.147, dst: 30.30.30.203) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x121e4090, expire_at: 0xa91f24756f3, last_packet_seen_at: 0xa8f49629a48, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 90.180.180.3, dst: 30.30.30.67) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 90.180.180.3, dst: 30.30.30.67) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x37265db4, expire_at: 0xa91f247f197, last_packet_seen_at: 0xa8f496334ee, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 220.6.67.139, dst: 30.30.30.73) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 220.6.67.139, dst: 30.30.30.73) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xae4ec750, expire_at: 0xa91f24b1699, last_packet_seen_at: 0xa8f496659f0, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 170.110.56.86, dst: 30.30.30.125) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 170.110.56.86, dst: 30.30.30.125) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x8967d1f8, expire_at: 0xa91f24b5eb5, last_packet_seen_at: 0xa8f4966a206, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 128.17.104.31, dst: 30.30.30.164) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 128.17.104.31, dst: 30.30.30.164) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x9781a624, expire_at: 0xa91f24cdfab, last_packet_seen_at: 0xa8f49682300, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 118.10.27.10, dst: 30.30.30.136) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 118.10.27.10, dst: 30.30.30.136) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x983464d0, expire_at: 0xa91f24ee2e9, last_packet_seen_at: 0xa8f496a2640, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 113.229.80.195, dst: 30.30.30.113) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... 
[2022-08-16 10:16:58] GATEKEEPER: Flow (src: 113.229.80.195, dst: 30.30.30.113) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0xa0a0d678, expire_at: 0xa91f25027fd, last_packet_seen_at: 0xa8f496b6b56, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 197.255.64.124, dst: 30.30.30.167) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 197.255.64.124, dst: 30.30.30.167) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x45f3d77c, expire_at: 0xa91f25069db, last_packet_seen_at: 0xa8f496bad2e, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 83.97.229.143, dst: 30.30.30.155) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 83.97.229.143, dst: 30.30.30.155) at lcore 4: [state: GK_REQUEST (0), flow_hash_value: 0x490c5508, expire_at: 0xa91f251d9d5, last_packet_seen_at: 0xa8f496d1d2c, last_priority: 38, allowance: 6, grantor_ip: 10.0.2.11] [2022-08-16 10:16:58] GATEKEEPER: Flow (src: 94.199.199.11, dst: 30.30.30.65) at lcore 4: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow...
The 1.0.0 master branch downloaded on May 28 has the method `gk_del_flow_entry_at_pos()`; I installed that version of the GK server. The stable 1.0.0 release does not have `gk_del_flow_entry_at_pos()`. The merged code probably caused the exception. I will redeploy the stable 1.0.0 release for testing.
Using stable version 1.0.0 and the same DDoS attack scenario, the following error occurs, but the GK server does not restart: [2022-08-16 16:15:05] GATEKEEPER: flow: The GK block failed to delete a key from hash table at gk_del_flow_entry_from_hash: No such file or directory for the flow with IP source address 58.77.92.151, and destination address 30.30.30.132
Hi @ShawnLeung87,
While the new information is useful, please answer my previous requests for information, so that I can formulate hypotheses about this issue.
Given that Gatekeeper runs in user space, no spontaneous reboot should happen. Since the server is rebooting by itself, a hardware failure may be occurring here. Does your Gatekeeper server have ECC RAM? Could you look for general failures in the kernel log file `/var/log/dmesg`? The open-source MemTest86 can help you test the RAM of your server.
How are you adding date and time to the log entries?
You say that the address 30.30.30.30/32 is the only target of the attack, but the log entries report other destinations: 30.30.30.113, 30.30.30.216, 30.30.30.219, etc. Can you explain this inconsistency in your report?

All logged flows you reported are in state `GK_REQUEST`. While this may not be related to the reboot, it may imply that your Gatekeeper does not have access to your Grantor server. Can you ping the destination without the attack going on, and log the corresponding flow? If this flow is still in `GK_REQUEST`, your Gatekeeper server is not reaching the Grantor server, or your Grantor server is not reaching back to your Gatekeeper server.
It's recommended to use Gatekeeper 1.0.0 with the patch "gk: heal corrupted flow tables" (#531) in production.
1. When the GK server abnormally reloads, there is no hardware-exception information in /var/log/dmesg.
2. Using the stable 1.0.0 Gatekeeper without `gk_del_flow_entry_at_pos()`, there is no reload, but there is a flow-table corruption log: "[2022-08-16 16:15:05] GATEKEEPER: flow: The GK block failed to delete a key from hash table at gk_del_flow_entry_from_hash: No such file or directory for the flow with IP source address 58.77.92.151, and destination address 30.30.30.132".
3. I simulate a DDoS (SYN+ACK) attack on a class C segment of IPs; the source address is random, and 30.30.30.x is a random destination address. In this scenario, the GK flow-table exception is most likely to appear.
4. I haven't applied the patch "gk: heal corrupted flow tables" (#531) yet; I will test again after updating it to see if the flow abnormality persists.
If a reload occurs in GK, KNI is re-initialized (during this process the GK and GT servers cannot be reached), and after the reload completes, BIRD is interrupted and is not restarted automatically. This exception is very serious. According to the explanation in #531, when too many flow-corruption logs occur (while these logs appear, GK and GT can still be reached normally), GK eventually hangs and Gatekeeper needs to be restarted.
> 1. When the GK server abnormally reloads, there is no hardware-exception information in /var/log/dmesg.
Check the output of the command `dmesg` after the server has logged entries related to failing to remove flow entries. The latest output of the command `dmesg` may not have been saved in `/var/log/dmesg`.
Does your Gatekeeper server have ECC RAM? Please don't skip my questions. Skipping my questions slows down the diagnosis of your problem.
Let's eliminate the possibility of a RAM fault here: test the RAM of your server with MemTest86.
> 2. Using the stable 1.0.0 Gatekeeper without `gk_del_flow_entry_at_pos()`, there is no reload, but there is a flow-table corruption log: "[2022-08-16 16:15:05] GATEKEEPER: flow: The GK block failed to delete a key from hash table at gk_del_flow_entry_from_hash: No such file or directory for the flow with IP source address 58.77.92.151, and destination address 30.30.30.132".
While the log entry is different, they refer to the same problem: the flow table is being corrupted. The Gatekeeper without `gk_del_flow_entry_at_pos()` detects the problem but doesn't try to fix it, whereas the version with `gk_del_flow_entry_at_pos()` tries to fix the flow table. The flow table could be corrupted due to a bug or a hardware failure.
You say here that "there is no reload", but, in another post, you wrote "GK server restart". Is your server rebooting (that is, is the Linux kernel restarting), or is only the Gatekeeper daemon being reloaded?
> 3. I simulate a DDoS (SYN+ACK) attack on a class C segment of IPs; the source address is random, and 30.30.30.x is a random destination address. In this scenario, the GK flow-table exception is most likely to appear.
This clarifies the log entries you posted before.
Since randomizing the source and destination addresses of the attack packets helps trigger the problem, this suggests the problem is related to an overpopulated flow table. Test varying the size of the flow table. What happens with half the size you have now? What about if you double it?
Your current flow table size is `flow_ht_size = 250000000`, correct? How much RAM does your Gatekeeper server have? How much of the RAM is allocated to huge pages?
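As a rough, hedged sanity check of why these numbers matter (the per-entry footprint and the lcore count below are assumptions for illustration, not Gatekeeper's exact figures):

```shell
# Back-of-envelope memory estimate for the flow tables.
# ENTRY_BYTES is an assumed per-entry footprint (key + entry + hash overhead).
ENTRY_BYTES=128
FLOW_HT_SIZE=250000000   # per GK lcore, from your configuration
GK_LCORES=4              # assumed; use your actual GK lcore count
TOTAL_BYTES=$((ENTRY_BYTES * FLOW_HT_SIZE * GK_LCORES))
echo "$((TOTAL_BYTES / 1024 / 1024 / 1024)) GiB"   # prints: 119 GiB
```

If a figure like this approaches the huge-page memory actually reserved, allocation pressure alone could explain odd behavior, which is why the RAM and huge-page numbers matter.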
> If a reload occurs in GK, KNI is re-initialized (during this process the GK and GT servers cannot be reached), and after the reload completes, BIRD is interrupted and is not restarted automatically. This exception is very serious. According to the explanation in #531, when too many flow-corruption logs occur (while these logs appear, GK and GT can still be reached normally), GK eventually hangs and Gatekeeper needs to be restarted.
Thank you for this information. Let's ignore `bird` for now, since it is a side-effect of the problem we are investigating here.
1. Server configuration:
CPU: Intel(R) Xeon(R) Platinum 8280M @ 2.60GHz, 28 cores, 2 NUMA nodes
RAM: DDR4 16 GB x 24 (384 GB) ECC, Samsung M393A2G40DB0-CPB
2. The restart mentioned here is not a restart of the server hardware, but a restart of the Gatekeeper application itself. When the Gatekeeper flow-table exception occurs, /var/log/dmesg shows no hardware-exception message.
3. GK flow-table configuration:

```lua
local flow_ht_size = 250000000
local flow_table_scan_iter = 0
local scan_del_thresh = flow_ht_size * 0.1
local max_num_ipv4_rules = 1600000
local num_ipv4_tbl8s = 1000
local max_num_ipv6_rules = 200000
local num_ipv6_tbl8s = 65536
local max_num_ipv6_neighbors = 65536
```
This error was caught 5 minutes before the flow exception occurred, using the command `dmesg -T -w`:
[Thu Aug 18 11:10:22 2022] perf: interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
Also, when the flow exception occurs, pings from the GK server to the GT server show packet loss.
The flow is abnormal, and the following error occurs before the GK server automatically re-initializes: [2022-08-18 11:41:41] GATEKEEPER: flow: gk_del_flow_entry_with_key(): the flow entry does not correspond to the flow key at postion 128997909; logging entry and releasing both flow keys...; while trying to show flow data, an unknown flow type 0 was found [2022-08-18 11:41:41] GATEKEEPER: flow: NOT in use [state: GK_REQUEST (0), flow_hash_value: 0x0, expire_at: 0x0, last_packet_seen_at: 0x0, last_priority: 0, allowance: 0, grantor_ip: NULL FIB entry]; while trying to show flow data, an unknown flow type 0 was found PANIC in rss_ip_flow_hf(): Unexpected protocol: 0
In sol.lua, `local req_bw_rate = 0.5` is set. With this configuration, when the GK request traffic exceeds that rate, there is serious packet loss, and the association between the flow and BPF cannot be obtained. Could this be a reason for the abnormal flow table?
Hi @ShawnLeung87,
I think I've found the root cause of this issue. Could you test pull request #596? In case you need it, here is how you can pull a pull request. Another way to get that code is to download the diff and apply it with the command `patch -p1 < 596.diff` against your local copy of the new branch `lts_v1.0`.
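A sketch of that second route (the file name `596.diff` and the assumption that you are already inside your Gatekeeper clone, on branch `lts_v1.0`, are illustrative):

```shell
# Run from the root of your local Gatekeeper clone, on branch lts_v1.0.
# 596.diff is the diff downloaded from the pull request page.
patch -p1 < 596.diff
# To back the change out later, reverse-apply the same diff:
#   patch -R -p1 < 596.diff
```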
If everything works fine with pull request #596 during your tests, I will merge this pull request and release version v1.0.1 since this fix is important for production.
I'll be patching Gatekeeper on Monday next week to continue testing the effect.
64-byte small-packet DDoS attack test report

Test preconditions:
1. DDoS SYN+ACK attack simulating random-source 64-byte packets.
2. Simulated 224 class A addresses in the routing FIB table.
3. `hping3 -n -S -A -d 24 -p 80 --flood --rand-source --rand-dest -I eno2 30.30.30.x`
4. GK server configuration: CPU Intel(R) Xeon(R) Platinum 8280M @ 2.60GHz, 28 cores, 2 NUMA nodes; RAM DDR4 16 GB x 24 (384 GB) ECC, Samsung M393A2G40DB0-CPB.
5. Huge pages: `default_hugepagesz=1G hugepagesz=1G hugepages=360`
6. Gatekeeper with patch #596.

Scenario 1
1. 4 GK lcores.
2. gk.lua:

```lua
local mailbox_max_entries_exp = 14
local mailbox_mem_cache_size = 0
local mailbox_burst_size = 320
local log_ratelimit_interval_ms = 5000
local log_ratelimit_burst = 100
local max_pkt_burst_front = 320
local max_pkt_burst_back = 320
local flow_ht_size = 250000000
local flow_table_scan_iter = 0
local scan_del_thresh = flow_ht_size * 0.1
local max_num_ipv4_rules = 1600000
local num_ipv4_tbl8s = 1000
local max_num_ipv6_rules = 65536
local num_ipv6_tbl8s = 4096
local max_num_ipv6_neighbors = 65536
local basic_measurement_logging_ms = 60 * 1000 -- (1 minute)
```
3. main_config.lua: `local n_gk_lcores = 2 * staticlib.count_numa_nodes(numa_table)`, `local n_sol_lcores_per_socket = 2`
4. sol.lua: `local req_bw_rate = 0.05`
5. Zabbix port traffic diagram: the test started at 10:54; after 11:38, the gatekeeper re-initialized and an abnormal log appeared.
6. Error: the gatekeeper re-initialized, and the following error log appeared:
[2022-08-22 11:38:25] GATEKEEPER: Flow (src: 59.109.120.5, dst: 30.30.30.193) at lcore 7: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow... [2022-08-22 11:38:26] GATEKEEPER: flow: gk_del_flow_entry_with_key(): the flow entry does not correspond to the flow key at postion 92610535; logging entry and releasing both flow keys...; while trying to show flow data, an unknown flow type 0 was found
Scenario 2

1. 16 GK lcores.
2. gk.lua:

```lua
local mailbox_max_entries_exp = 14
local mailbox_mem_cache_size = 0
local mailbox_burst_size = 320
local log_ratelimit_interval_ms = 5000
local log_ratelimit_burst = 100
local max_pkt_burst_front = 320
local max_pkt_burst_back = 320
local flow_ht_size = 62500000
local flow_table_scan_iter = 0
local scan_del_thresh = flow_ht_size * 0.1
local max_num_ipv4_rules = 400000
local num_ipv4_tbl8s = 256
local max_num_ipv6_rules = 65536
local num_ipv6_tbl8s = 4096
local max_num_ipv6_neighbors = 65536
local basic_measurement_logging_ms = 60 * 1000 -- (1 minute)
local front_icmp_msgs_per_sec = 1000
local front_icmp_msgs_burst = 50
local back_icmp_msgs_per_sec = 1000
local back_icmp_msgs_burst = 50
```
3. main_config.lua: `local n_gk_lcores = 8 * staticlib.count_numa_nodes(numa_table)`, `local n_sol_lcores_per_socket = 2`
4. sol.lua: `local req_bw_rate = 0.05`
5. Zabbix port traffic diagram: the test started at 12:00 and has been running for 5 hours; the gatekeeper has not re-initialized and no abnormal log has appeared. It is still running.
6. Errors and exceptions: only the following log entry appears (repeated many times); its level is NOTICE in Gatekeeper.
[2022-08-22 15:38:40] GATEKEEPER LLS: front interface should not be seeing a packet with EtherType 0x0800
Summary of test results: does this test indicate that, under an attack of 64-byte packets, the flow_ht_size of each lcore cannot be too large?
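To reason about that question, a back-of-the-envelope sketch may help. The per-entry size of 128 bytes below is an assumption for illustration, not taken from the Gatekeeper source, and the 1 Mpps attack rate per lcore is likewise assumed:

```shell
#!/bin/sh
# Rough sizing of the per-lcore flow tables in scenario 2.
# ASSUMPTION: ~128 bytes per flow entry (key + state); check the real
# struct sizes in the Gatekeeper source before relying on this number.
flow_ht_size=62500000   # from gk.lua above
n_gk_lcores=16          # from scenario 2
entry_bytes=128

total=$(( flow_ht_size * n_gk_lcores * entry_bytes ))
echo "approx. flow-table memory: $(( total / 1024 / 1024 / 1024 )) GiB"

# With a random-source SYN+ACK flood, nearly every packet opens a new flow.
# ASSUMPTION: 1 Mpps of the flood reaching a single lcore.
pps=1000000
echo "time for one lcore to see flow_ht_size distinct flows: $(( flow_ht_size / pps )) s"
```

Under these assumptions the tables alone run to roughly a hundred GiB, all of which must fit in huge pages, and a sustained random-source flood fills a table in about a minute; this is consistent with the maintainers' earlier question about RAM and huge-page allocation.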
The following log entries from scenario 1 show that there was an issue while applying the code of pull request #596:
[2022-08-22 11:38:25] GATEKEEPER: Flow (src: 59.109.120.5, dst: 30.30.30.193) at lcore 7: gk_del_flow_entry_at_pos(): flow was not indexed; logging and dropping flow...
[2022-08-22 11:38:26] GATEKEEPER: flow: gk_del_flow_entry_with_key(): the flow entry does not correspond to the flow key at postion 92610535; logging entry and releasing both flow keys...; while trying to show flow data, an unknown flow type 0 was found
As you can check in file lib/flow.c of pull request #596, there should be no log entries terminating with "; while trying to show flow data, an unknown flow type %hu was found", since they are replaced with a call to the new function print_invalid_flow_err_msg(), which includes more information. Perhaps the pull request was applied correctly, but the older binary was somehow used during the tests.
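One quick way to tell which code a running gatekeeper binary was built from is to look for the old format string embedded in it. The binary path below is an assumption; adjust it to your install location:

```shell
#!/bin/sh
# If the old log format string is still embedded in the binary, the build
# predates the PR #596 change. BIN is an ASSUMED path; adjust as needed.
BIN=./build/gatekeeper

if strings "$BIN" | grep -q 'an unknown flow type'; then
    echo "old log string present: binary predates the PR #596 change"
else
    echo "old log string absent: binary likely includes PR #596"
fi
```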
Side notes:
- The GK parameters local max_pkt_burst_front = 320 and local max_pkt_burst_back = 320 used in both scenarios may add latency in production. I advise testing the need for this change. If your tests do not confirm significant gains, I'd return to the default values.
- The parameter local n_sol_lcores_per_socket = 2 in main_config.lua is likely just wasting resources in scenario 1.
- While log entries like [2022-08-22 15:38:40] GATEKEEPER LLS: front interface should not be seeing a packet with EtherType 0x0800 are just notices and not related to the problem of this thread, you may later want to investigate what is triggering them in your environment.
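If you do investigate those notices later, capturing a few of the offending IPv4 frames with their link-layer headers should reveal the sender's MAC address. The interface name below is a placeholder, and note that if the front NIC is bound to DPDK, the kernel's tcpdump will not see its traffic; you would need to mirror the port at the switch instead:

```shell
# Capture 10 IPv4 (EtherType 0x0800) frames with link-layer headers so the
# source MAC is visible. "eth0" is a PLACEHOLDER for the front interface
# (or a mirror of it, if the NIC is bound to DPDK).
sudo tcpdump -e -n -c 10 -i eth0 'ether proto 0x800'
```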
When I recompile, I get the same warning message.
This was caused by a problem with pulling the code in diff mode. I manually pulled the code again and it can now be used. I'll test it in a 4-lcore GK scenario and see if it works.
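For reference, a reliable way to test a pull request is to fetch its head ref directly from GitHub into a local branch rather than applying a diff by hand. This sketch assumes "origin" points at the upstream Gatekeeper repository:

```shell
#!/bin/sh
# Fetch the head commit of pull request #596 into a local branch.
# ASSUMES the remote "origin" is the upstream Gatekeeper repository.
git fetch origin pull/596/head:pr-596
git checkout pr-596
# Then rebuild from a clean tree so no stale objects survive.
```

This uses GitHub's read-only pull/<number>/head refs, so the result is exactly the code under review, with no risk of a partially applied patch.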
The improvement of the code that heals the flow table has been forward ported to v1.1 and merged (see pull request #601).
Moreover, I've investigated the root cause of the corruption of the flow table, and it seems to be a bug in the custom DPDK used in Gatekeeper v1.0, although I have not been able to pinpoint exactly where this bug is. Testing v1.1 in an environment similar to the one you used, there was no corruption of the flow table.
It'd be helpful if you could test v1.1 in your test environment. Notice that you can reinstall only your Gatekeeper server with version 1.1 and keep your Grantor servers at version 1.0; this should simplify your test.
I have updated GK to the latest version 1.1 and kept GT at version 1.0 as described above, but now data cannot be requested normally: the server port responds to telnet and ping, but there is a problem with data transmission.
1. For example, with remote SSH, tcpdump shows that the data is not received by the server.
2. Flow BPF status:
[2022-10-15 18:44:19] GK/6: Flow (src: 20.20.20.20, dst: 30.30.30.30) at index 1: [state: GK_BPF (3), flow_hash_value: 0xd91eebd5, expire_at: 0x279fe7c5d632, program_index=0, cookie=03c68f3eed260000, 0004000000000000, 16fa0f0000000000, cc8734ab7b270000, 6801d4cf01000000, 0000000000000000, 0000000000000000, 0000000000000000, grantor_ip: 10.0.2.11]
3. After removing the FWD_GRANTOR entry, access works normally.
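Since telnet and ping work but payload traffic does not, one thing worth checking is whether the GK server is actually forwarding the flow's packets toward the Grantor (10.0.2.11 in the flow dump above). This sketch assumes the default IP-in-IP encapsulation between Gatekeeper and Grantor; "eth1" is a placeholder for the back interface, and if that NIC is bound to DPDK you would need a mirror port to observe it:

```shell
# Look for IP-in-IP (IP protocol 4) packets from GK toward the Grantor.
# "eth1" is a PLACEHOLDER for the back interface or a mirror of it;
# 10.0.2.11 is the grantor_ip reported in the flow dump.
sudo tcpdump -n -c 10 -i eth1 'ip proto 4 and host 10.0.2.11'
```

Seeing encapsulated packets leave GK but no replies would point at the GT side or the path between them; seeing nothing at all would point at the GK data path.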