High CPU usage when running Katran in shared mode with bonding interface
Hi everyone!
I am currently running Katran as an L3 director load balancer for our services. I would like to run Katran on a bonding interface, because I believe it's easier to add more network interfaces than whole servers to scale Katran's workload. I followed this issue (https://github.com/facebookincubator/katran/issues/13) and got Katran working normally in shared mode on a bonding interface with the commands below:
# Network config
1: lo: ...
2: ens2f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 xdp/id:1900 qdisc mq master bond0 state UP group default qlen 1000
link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff permaddr d4:f5:ef:36:1a:60
altname enp55s0f0
3: ens2f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 xdp/id:1905 qdisc mq master bond0 state UP group default qlen 1000
link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff permaddr d4:f5:ef:36:1a:68
altname enp55s0f1
4: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
5: ipip0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
6: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr 16dd:8fa:927c::
7: ipip60@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr 42d6:82ed:7cf5::
inet6 fe80::40d6:82ff:feed:7cf5/64 scope link
valid_lft forever preferred_lft forever
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 02:96:77:09:2b:73 brd ff:ff:ff:ff:ff:ff
inet 10.50.73.53/24 brd 10.50.73.255 scope global bond0
valid_lft forever preferred_lft forever
inet6 fe80::96:77ff:fe09:2b73/64 scope link
valid_lft forever preferred_lft forever
## For the xdp root program I edited the install_xdproot.sh script.
## Then I ran just one command to add the Katran load balancer xdp program:
sudo ./build/example_grpc/katran_server_grpc -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o -default_mac 58:e4:34:56:46:e0 -forwarding_cores=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 -numa_nodes=0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1 -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o -intf=ens2f0 -ipip_intf=ipip0 -ipip6_intf=ipip60 -lru_size=100000 -map_path /sys/fs/bpf/jmp_ens2 -prog_pos=2
## Katran VIP + REAL config
2024/09/08 04:23:39 vips len 1
VIP: 49.213.85.151 Port: 80 Protocol: tcp
Vip's flags:
->49.213.85.171 weight: 1 flags:
exiting
- I mapped IRQs to CPUs:
i40e-ens2f0-TxRx-0(45) is affinitive with 00,00000001 (from CPU1 .....)
i40e-ens2f0-TxRx-1(46) is affinitive with 00,00000002
i40e-ens2f0-TxRx-2(47) is affinitive with 00,00000004
i40e-ens2f0-TxRx-3(48) is affinitive with 00,00000008
i40e-ens2f0-TxRx-4(49) is affinitive with 00,00000010
i40e-ens2f0-TxRx-5(50) is affinitive with 00,00000020
i40e-ens2f0-TxRx-6(51) is affinitive with 00,00000040
i40e-ens2f0-TxRx-7(52) is affinitive with 00,00000080
i40e-ens2f0-TxRx-8(53) is affinitive with 00,00000100
i40e-ens2f0-TxRx-9(54) is affinitive with 00,00000200
i40e-ens2f1-TxRx-0(95) is affinitive with 00,00000400
i40e-ens2f1-TxRx-1(96) is affinitive with 00,00000800
i40e-ens2f1-TxRx-2(97) is affinitive with 00,00001000
i40e-ens2f1-TxRx-3(98) is affinitive with 00,00002000
i40e-ens2f1-TxRx-4(99) is affinitive with 00,00004000
i40e-ens2f1-TxRx-5(100) is affinitive with 00,00008000
i40e-ens2f1-TxRx-6(101) is affinitive with 00,00010000
i40e-ens2f1-TxRx-7(102) is affinitive with 00,00020000
i40e-ens2f1-TxRx-8(103) is affinitive with 00,00040000
i40e-ens2f1-TxRx-9(104) is affinitive with 00,00080000 (to CPU 20)
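A mapping like the one above can be scripted; here is a rough sketch (hypothetical, not the exact script used here): it walks /proc/interrupts, assigns one CPU bit per i40e queue IRQ, and prints the smp_affinity writes as a dry run rather than applying them.

```shell
#!/bin/sh
# Dry-run sketch of the IRQ-to-CPU mapping shown above: queue N of each
# i40e port gets the hex CPU bitmask (1 << cpu), one CPU per queue.
# Run the echoed commands as root to actually apply the affinity.
cpu=0
for intf in ens2f0 ens2f1; do
  for irq in $(grep "i40e-$intf" /proc/interrupts | cut -d: -f1 | tr -d ' '); do
    printf 'echo %x > /proc/irq/%s/smp_affinity\n' "$((1 << cpu))" "$irq"
    cpu=$((cpu + 1))
  done
done
```

The masks match the table above: CPU 4 is `00000010` (1 << 4) and CPU 19 is `00080000` (1 << 19).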
- The problem appeared when I looked at Katran's stats: they show a 100% lru miss rate.
## katran_goclient -s -lru
summary: 6380747 pkts/sec. lru hit: 0.00% lru miss: 100.00% (tcp syn: 1.00% tcp non-syn: 0.00% udp: 0.00%) fallback lru hit: 0 pkts/sec
summary: 6668858 pkts/sec. lru hit: -0.00% lru miss: 100.00% (tcp syn: 1.00% tcp non-syn: 0.00% udp: 0.00%) fallback lru hit: 0 pkts/sec
summary: 6657124 pkts/sec. lru hit: 0.00% lru miss: 100.00% (tcp syn: 1.00% tcp non-syn: 0.00% udp: -0.00%) fallback lru hit: 0 pkts/sec
- All 20 CPUs are consumed by ksoftirqd.
- Here is a screenshot showing the output of perf report:
I am not sure whether this performance issue is related to Katran or not, so I am posting the question here to look for clues.
Feel free to ask me to provide more information!
I have one more test case to add from my research. I wanted to know whether the bonding interface causes the CPU overload, so I removed the bonding interface and ran Katran in shared mode directly on the two physical interfaces.
- Here is my network config
1: lo: ....
2: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:2118 qdisc mq state UP group default qlen 1000
link/ether d4:f5:ef:ac:ac:f0 brd ff:ff:ff:ff:ff:ff
altname enp18s0f0
inet 10.50.73.55/24 brd 10.50.73.255 scope global ens1f0
valid_lft forever preferred_lft forever
inet6 fe80::d6f5:efff:feac:acf0/64 scope link
valid_lft forever preferred_lft forever
3: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp/id:2123 qdisc mq state UP group default qlen 1000
link/ether d4:f5:ef:ac:ac:f8 brd ff:ff:ff:ff:ff:ff
altname enp18s0f1
inet 10.50.73.52/24 brd 10.50.73.255 scope global ens1f1
valid_lft forever preferred_lft forever
inet6 fe80::d6f5:efff:feac:acf8/64 scope link
valid_lft forever preferred_lft forever
....
8: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
9: ipip0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
10: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr f27f:220f:4e91::
11: ipip60@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN group default qlen 1000
link/tunnel6 :: brd :: permaddr baba:33a1:a4e6::
inet6 fe80::b8ba:33ff:fea1:a4e6/64 scope link
valid_lft forever preferred_lft forever
- Command that I used to run Katran
sudo ./build/example_grpc/katran_server_grpc -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o -default_mac 58:e4:34:56:46:e0 -forwarding_cores=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 -numa_nodes=0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1 -healthchecker_prog ./deps/bpfprog/bpf/healthchecking_ipip.o -intf=ens1f0 -ipip_intf=ipip0 -ipip6_intf=ipip60 -lru_size=1000000 -map_path /sys/fs/bpf/jmp_ens1 -prog_pos=2
- Here is CPU mapping information
i40e-ens1f0-TxRx-0(44) is affinitive with 00,00000001
i40e-ens1f0-TxRx-1(45) is affinitive with 00,00000002
i40e-ens1f0-TxRx-2(46) is affinitive with 00,00000004
i40e-ens1f0-TxRx-3(47) is affinitive with 00,00000008
i40e-ens1f0-TxRx-4(48) is affinitive with 00,00000010
i40e-ens1f0-TxRx-5(49) is affinitive with 00,00000020
i40e-ens1f0-TxRx-6(50) is affinitive with 00,00000040
i40e-ens1f0-TxRx-7(51) is affinitive with 00,00000080
i40e-ens1f0-TxRx-8(52) is affinitive with 00,00000100
i40e-ens1f0-TxRx-9(53) is affinitive with 00,00000200
i40e-ens1f1-TxRx-0(103) is affinitive with 00,00000400
i40e-ens1f1-TxRx-1(104) is affinitive with 00,00000800
i40e-ens1f1-TxRx-2(105) is affinitive with 00,00001000
i40e-ens1f1-TxRx-3(106) is affinitive with 00,00002000
i40e-ens1f1-TxRx-4(107) is affinitive with 00,00004000
i40e-ens1f1-TxRx-5(108) is affinitive with 00,00008000
i40e-ens1f1-TxRx-6(109) is affinitive with 00,00010000
i40e-ens1f1-TxRx-7(110) is affinitive with 00,00020000
i40e-ens1f1-TxRx-8(111) is affinitive with 00,00040000
i40e-ens1f1-TxRx-9(112) is affinitive with 00,00080000
- Katran's stats:
- CPU usage is also full.
- Here is the output from perf: watching with perf top, then recording and viewing with perf report.
There is a small performance improvement when running on the physical interfaces (according to Katran's output), but CPU usage is still full.
Hi @avasylev @tehnerd,
Could you guys share some thoughts on my setup? I am still struggling with this.
sudo sysctl -a | grep bpf ?
Hi @tehnerd,
Here is the output:
Hmm. Strange. Please collect: 1) perf record -a -F 23 -- sleep 10; 2) the same perf as before (in your previous pictures; I guess you were using -g as well). When looking into the report, move to the balancer's bpf program and use the 'a' shortcut. That will show the assembly code, so we can understand where exactly in the bpf program we consume cpu.
Also, what does the traffic pattern look like? Are they real tcp streams or just random packets?
Please collect: 1) perf record -a -F 23 -- sleep 10; 2) the same perf as before (in your previous pictures; I guess you were using -g as well)
- I have a little trouble getting the assembly code from perf report.
- With the bpf program, it shows an error.
- With other processes, I used the 'a' shortcut and it returned output like this one.
- Anyway, here is all the output that I collected from those commands:
- perf record -a -F 23 -- sleep 10
- Assembly code when I jump (press Enter) deeper into Katran's bpf program
- perf record -ag -- sleep 20
- perf top --sort comm,dso
Again, it shows an error when I press the 'a' shortcut on the bpf program.
Also, what does the traffic pattern look like? Are they real tcp streams or just random packets?
- I used pktgen to generate traffic; here is the configuration:
- In short, I want to simulate a SYN flood and send it to Katran. I used the xdpdump tool to capture the packets, and they look like this:
06:24:05.724378 IP 49.213.85.169.14397 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724396 IP 49.213.85.169.14997 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
06:24:05.724401 IP 49.213.85.169.14984 > 49.213.85.152.80: Flags [S], seq 74616:74622, win 8192, length 6: HTTP
- Here is the Katran configuration
katran_goclient -l -server localhost:8080
2024/09/26 06:26:30 vips len 2
VIP: 49.213.85.153 Port: 80 Protocol: tcp
Vip's flags:
->49.213.85.171 weight: 1 flags:
VIP: 49.213.85.152 Port: 80 Protocol: tcp
Vip's flags:
->49.213.85.171 weight: 1 flags:
exiting
UPDATE:
- While trying to figure out the cause, I stopped at this thread, which describes the i40e NIC driver dropping packets at a rate of 10 Mpps: https://www.spinics.net/lists/xdp-newbies/msg01918.html
- My server has the same NIC driver:
driver: i40e
version: 5.15.0-122-generic
firmware-version: 10.53.7
expansion-rom-version:
bus-info: 0000:37:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
- And I saw a lot of packet drops on the rx path.
- I think this is important information that may show Katran is not the cause of the high CPU usage.
Feels like bpf program is not jitted. Could you please run bpftool prog list and bpftool map list
Also, in the perf report (the one taken with -ag), please show the output filtered with the bpf keyword (press / and then type bpf).
Yes, sure, here is the output of those commands:
bpftool prog list
bpftool map list
perf report with the output filtered by the bpf keyword
UPDATE: Here is the interface tag. It seems the bpf program is jitted, at least from the outside view; I hope this information is helpful.
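For completeness, a generic way to cross-check jit status from the outside (not specific to Katran):

```shell
# The sysctl must be 1 (or 2, with debug output) for the bpf JIT to be on;
# `bpftool prog show` then reports "jited <N>B" for each jitted program.
cat /proc/sys/net/core/bpf_jit_enable
command -v bpftool >/dev/null 2>&1 && bpftool prog show || true
```

On kernels built with CONFIG_BPF_JIT_ALWAYS_ON the sysctl is pinned to 1.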
Hi @tehnerd,
Would you happen to have any updates on this issue? Feel free to ask me to provide more information or do some tests!
No idea. For some reason the bpf program seems slow in the bpf code itself. At this point the only idea is to build perf with bpf support (link against the library required to disassemble bpf) and to check where the cpu is spent inside bpf.
You mentioned that it felt like the bpf program was not jitted. But, according to your answer, that does not seem to be the case here, right? So the next step is building perf with bpf support and checking inside the bpf program.
Hi @tehnerd,
It has been a while, but I finally got the assembly output from inside Katran's bpf load balancer program.
Command: perf record -a -F 23 -- sleep 10
- the point in the assembly code marked red (hot)
Command: perf record -ag -- sleep 20
- the point in the assembly code marked red (hot)
I attached the content of the perf report --stdio as a zip file here, in case it helps
perf.zip
Could you please take a look at that?
All of that is memory accesses. It feels like there is some issue with it. Either slow memory, or the system is low on memory, or the tlb is thrashed. What does the environment look like? Is it a vm or not? What memory? How much? How much free? Would be nice to see perf counters for memory accesses (stalled frontend/backend cycles, tlb stats).
Here is the memory information from my server.
Is it vm or not? What the memory? How much ? How much free?
- The server is physical, not a VM.
- It has 64GB of memory, and there is no significant increase in used memory when the traffic arrives; maybe the user-space tools cannot catch that.
- Here is the numa node information.
- The perf counters for memory accesses are here; I hope these are all the counters you need.
- UPDATE: perf counters for CPU cycles
Hi @tehnerd,
Did you find any clues from the memory stats? Feel free to ask me to provide more information!
Can you run perf to collect counters for "cycles,stalled-cycles-frontend,stalled-cycles-backend"? The only explanation that makes sense for a mov to/from memory being high on cpu is slow memory access for some reason, which would be indicated by a high value of stalled cycles.
It's quite a challenge to collect those counters because my CPU is a Cascade Lake microarchitecture, and it seems the stalled-cycles-frontend and stalled-cycles-backend counters are not supported.
Model name: Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz
Model number: 85
The Intel Xeon Silver 4210R belongs to the Cascade Lake family.
I looked into the kernel event code (arch/x86/events/intel/core.c, kernel version 5.15: https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/linux-5.15.tar.gz) and captured the counters by their event masks; the outputs are here:
- Perf in the normal scenario
- Perf in high network workload (syn flood)
I found the IPC metric, which is useful for identifying CPU stalls (this blog is very helpful: https://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html). Here is the IPC measurement from my server:
The value is around 0.18 to 0.23, which is quite low, and the interpretation from the blog relates a low IPC value to being memory stalled:
If your IPC is < 1.0, you are likely memory stalled, and software tuning strategies include reducing memory I/O, and improving CPU caching and memory locality, especially on NUMA systems. Hardware tuning includes using processors with larger CPU caches, and faster memory, busses, and interconnects.
I suppose that focusing on memory tuning could increase performance, but I have no clues on that yet. Could you show me some configurations that help optimize memory performance?
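For reference, the IPC figure above is just instructions divided by cycles, from the two counters that `perf stat -a -e cycles,instructions -- sleep 10` prints system-wide. A minimal sketch with placeholder counts (not my real measurements) illustrating the ~0.2 IPC seen here:

```shell
# Placeholder counter values; substitute the real perf stat output.
cycles=1000000000
instructions=200000000
awk -v i="$instructions" -v c="$cycles" 'BEGIN { printf "IPC = %.2f\n", i / c }'
# -> IPC = 0.20
```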
Yeah. Stalls are way too high. And they are not even on map accesses. Anyway, I think there is some issue with the hardware. Do you have any other test machine to run on?
Actually, I have another test machine, but the hardware specs are mostly the same except for the network driver.
The server in this discussion uses the ixgbe network driver.
I have another one with the i40e driver, and here is some performance output from that machine.
The CPU utilization is still full.
From your perspective, which hardware spec would you change in this case?
That seems pretty strange. I do not think it is related to the NIC. Seems like some strange memory-related hardware issue or specific. I will post later today how to run it, but I wonder what the results of synthetic load tests would look like. They test just the bpf code itself, and from perf it looked like the issue was visible inside it.
Hi @tehnerd,
I am still waiting for your instructions here.
I will try to reply tomorrow
So you would need to build katran's bpf program with the INLINE_DECAP define set: https://github.com/facebookincubator/katran/blob/main/katran/lib/bpf/balancer_consts.h#L352
Then you would need to run katran_tester with the flag perf_testing=true: https://github.com/facebookincubator/katran/blob/main/katran/lib/testing/katran_tester.cpp#L57
As an example of how to run the tester (just make sure to remove the test_from_fixtures flag): https://github.com/facebookincubator/katran/blob/main/os_run_tester.sh#L31
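Putting those steps together, a rough sketch of the invocation (hypothetical paths; they depend on your checkout and build layout):

```shell
# 1) rebuild the bpf program with inline decap compiled in
#    (set the INLINE_DECAP define when compiling balancer.bpf.o)
# 2) run the tester in perf mode, without the -test_from_fixtures flag
sudo ./build/katran/lib/testing/katran_tester \
    -balancer_prog ./deps/bpfprog/bpf/balancer.bpf.o \
    -perf_testing=true
```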
The output would look something like this:
I1106 19:24:26.794119 153197 KatranLb.cpp:1112] modifying vip: fc00:1::2:443:17
I1106 19:24:26.978382 153197 BpfTester.cpp:306] Test: packet to UDP based v4 VIP (and v4 real) duration: 119 ns/pckt or 8403361 pps
I1106 19:24:27.091374 153197 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real) duration: 101 ns/pckt or 9900990 pps
I1106 19:24:27.206372 153197 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real) + ToS in IPV4 duration: 101 ns/pckt or 9900990 pps
I1106 19:24:27.331370 153197 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real; any dst ports). duration: 112 ns/pckt or 8928571 pps
I1106 19:24:27.445369 153197 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v6 real) duration: 102 ns/pckt or 9803921 pps
I1106 19:24:27.559369 153197 BpfTester.cpp:306] Test: packet to TCP based v6 VIP (and v6 real) duration: 102 ns/pckt or 9803921 pps
I1106 19:24:27.674368 153197 BpfTester.cpp:306] Test: packet to TCP based v6 VIP (and v6 real) with ToS / tc set duration: 102 ns/pckt or 9803921 pps
I1106 19:24:27.696365 153197 BpfTester.cpp:306] Test: v4 ICMP echo-request duration: 9 ns/pckt or 111111111 pps
I1106 19:24:27.718365 153197 BpfTester.cpp:306] Test: v6 ICMP echo-request duration: 9 ns/pckt or 111111111 pps
I1106 19:24:27.867385 153197 BpfTester.cpp:306] Test: v4 ICMP dest-unreachabe fragmentation-needed duration: 136 ns/pckt or 7352941 pps
I1106 19:24:28.025378 153197 BpfTester.cpp:306] Test: v6 ICMP packet-too-big duration: 146 ns/pckt or 6849315 pps
I1106 19:24:28.049368 153197 BpfTester.cpp:306] Test: drop of IPv4 packet w/ options duration: 8 ns/pckt or 125000000 pps
I1106 19:24:28.079370 153197 BpfTester.cpp:306] Test: drop of IPv4 fragmented packet duration: 9 ns/pckt or 111111111 pps
I1106 19:24:28.106365 153197 BpfTester.cpp:306] Test: drop of IPv6 fragmented packet duration: 8 ns/pckt or 125000000 pps
I1106 19:24:28.157368 153197 BpfTester.cpp:306] Test: pass of v4 packet with dst not equal to any configured VIP duration: 39 ns/pckt or 25641025 pps
I1106 19:24:28.206367 153197 BpfTester.cpp:306] Test: pass of v6 packet with dst not equal to any configured VIP duration: 37 ns/pckt or 27027027 pps
I1106 19:24:28.225365 153197 BpfTester.cpp:306] Test: pass of arp packet duration: 7 ns/pckt or 142857142 pps
I1106 19:24:28.336370 153197 BpfTester.cpp:306] Test: LRU hit duration: 99 ns/pckt or 10101010 pps
I1106 19:24:28.504374 153197 BpfTester.cpp:306] Test: packet #1 dst port hashing only duration: 156 ns/pckt or 6410256 pps
I1106 19:24:28.671370 153197 BpfTester.cpp:306] Test: packet #2 dst port hashing only duration: 155 ns/pckt or 6451612 pps
I1106 19:24:28.799374 153197 BpfTester.cpp:306] Test: ipinip packet duration: 116 ns/pckt or 8620689 pps
I1106 19:24:28.910370 153197 BpfTester.cpp:306] Test: ipv6inipv6 packet duration: 99 ns/pckt or 10101010 pps
I1106 19:24:29.040375 153197 BpfTester.cpp:306] Test: ipv4inipv6 packet duration: 118 ns/pckt or 8474576 pps
I1106 19:24:29.180369 153197 BpfTester.cpp:306] Test: QUIC: long header. Client Initial type. LRU miss duration: 123 ns/pckt or 8130081 pps
I1106 19:24:29.318370 153197 BpfTester.cpp:306] Test: QUIC: long header. 0-RTT Protected. CH. LRU hit. duration: 124 ns/pckt or 8064516 pps
I1106 19:24:29.440368 153197 BpfTester.cpp:306] Test: QUIC: long header. Handshake. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. duration: 105 ns/pckt or 9523809 pps
I1106 19:24:29.558368 153197 BpfTester.cpp:306] Test: QUIC: long header. Retry. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. duration: 105 ns/pckt or 9523809 pps
I1106 19:24:29.702373 153197 BpfTester.cpp:306] Test: QUIC: long header. client initial. v6 vip v6 real. LRU miss duration: 129 ns/pckt or 7751937 pps
I1106 19:24:29.839370 153197 BpfTester.cpp:306] Test: QUIC: short header. No connection id. LRU hit duration: 124 ns/pckt or 8064516 pps
I1106 19:24:29.956370 153197 BpfTester.cpp:306] Test: QUIC: short header w/ connection id duration: 105 ns/pckt or 9523809 pps
I1106 19:24:30.096371 153197 BpfTester.cpp:306] Test: QUIC: short header w/ connection id 1092 but non-existing mapping. LRU hit duration: 128 ns/pckt or 7812500 pps
I1106 19:24:30.232370 153197 BpfTester.cpp:306] Test: QUIC: short header w/ conn id. host id = 0. LRU hit duration: 124 ns/pckt or 8064516 pps
I1106 19:24:30.270366 153197 BpfTester.cpp:306] Test: UDP: big packet of length 1515. trigger PACKET TOOBIG duration: 26 ns/pckt or 38461538 pps
I1106 19:24:30.387372 153197 BpfTester.cpp:306] Test: QUIC: short header w/ connection id. CIDv2 duration: 105 ns/pckt or 9523809 pps
I1106 19:24:30.531375 153197 BpfTester.cpp:306] Test: QUIC: short header w/ connection id 197700 but non-existing mapping. CIDv2. LRU hit. duration: 128 ns/pckt or 7812500 pps
This only tests the bpf subsystem itself. That's why the numbers are so high (no code that handles packets / the NIC is involved, and cache hits are 100%). I just want to check what these numbers look like on your system, because from the perf output it seems bpf is slow there for some reason.
Thanks for your guidance! I will share with you the output soon.
Hi @tehnerd,
Here is the perf_testing output:
I1110 06:13:34.652143 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.1:80:17
I1110 06:13:34.724103 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.1:80:6
I1110 06:13:34.793542 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.2:0:6
I1110 06:13:34.862829 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.4:0:6
I1110 06:13:34.929800 2084700 KatranLb.cpp:1112] modifying vip: 10.200.1.4:0:6
I1110 06:13:34.929811 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.3:80:6
I1110 06:13:34.996840 2084700 KatranLb.cpp:983] adding new vip: fc00:1::1:80:6
I1110 06:13:35.063572 2084700 KatranLb.cpp:983] adding new vip: 10.200.1.5:443:17
I1110 06:13:35.063591 2084700 KatranLb.cpp:1112] modifying vip: 10.200.1.5:443:17
I1110 06:13:35.130414 2084700 KatranLb.cpp:983] adding new vip: fc00:1::2:443:17
I1110 06:13:35.130434 2084700 KatranLb.cpp:1112] modifying vip: fc00:1::2:443:17
E1110 06:13:35.197413 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
E1110 06:13:35.197420 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
E1110 06:13:35.197422 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
E1110 06:13:35.197424 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
E1110 06:13:35.197427 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
E1110 06:13:35.197428 2084700 KatranLb.cpp:1602] source based routing is not enabled in forwarding plane
I1110 06:13:35.210022 2084700 BpfTester.cpp:306] Test: packet to UDP based v4 VIP (and v4 real) duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.223362 2084700 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real) duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.236763 2084700 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real) + ToS in IPV4 duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.250176 2084700 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v4 real; any dst ports). duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.263258 2084700 BpfTester.cpp:306] Test: packet to TCP based v4 VIP (and v6 real) duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.276157 2084700 BpfTester.cpp:306] Test: packet to TCP based v6 VIP (and v6 real) duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.289229 2084700 BpfTester.cpp:306] Test: packet to TCP based v6 VIP (and v6 real) with ToS / tc set duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.302788 2084700 BpfTester.cpp:306] Test: v4 ICMP echo-request duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.315153 2084700 BpfTester.cpp:306] Test: v6 ICMP echo-request duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.328374 2084700 BpfTester.cpp:306] Test: v4 ICMP dest-unreachabe fragmentation-needed duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.341766 2084700 BpfTester.cpp:306] Test: v6 ICMP packet-too-big duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.352962 2084700 BpfTester.cpp:306] Test: drop of IPv4 packet w/ options duration: 11 ns/pckt or 90909090 pps
I1110 06:13:35.365142 2084700 BpfTester.cpp:306] Test: drop of IPv4 fragmented packet duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.375741 2084700 BpfTester.cpp:306] Test: drop of IPv6 fragmented packet duration: 10 ns/pckt or 100000000 pps
I1110 06:13:35.419065 2084700 BpfTester.cpp:306] Test: pass of v4 packet with dst not equal to any configured VIP duration: 43 ns/pckt or 23255813 pps
I1110 06:13:35.463760 2084700 BpfTester.cpp:306] Test: pass of v6 packet with dst not equal to any configured VIP duration: 44 ns/pckt or 22727272 pps
I1110 06:13:35.472584 2084700 BpfTester.cpp:306] Test: pass of arp packet duration: 8 ns/pckt or 125000000 pps
I1110 06:13:35.485366 2084700 BpfTester.cpp:306] Test: LRU hit duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.498824 2084700 BpfTester.cpp:306] Test: packet #1 dst port hashing only duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.512166 2084700 BpfTester.cpp:306] Test: packet #2 dst port hashing only duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.525400 2084700 BpfTester.cpp:306] Test: ipinip packet duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.538502 2084700 BpfTester.cpp:306] Test: ipv6inipv6 packet duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.551779 2084700 BpfTester.cpp:306] Test: ipv4inipv6 packet duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.565048 2084700 BpfTester.cpp:306] Test: QUIC: long header. Client Initial type. LRU miss duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.578521 2084700 BpfTester.cpp:306] Test: QUIC: long header. 0-RTT Protected. CH. LRU hit. duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.591617 2084700 BpfTester.cpp:306] Test: QUIC: long header. Handshake. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.604856 2084700 BpfTester.cpp:306] Test: QUIC: long header. Retry. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.617874 2084700 BpfTester.cpp:306] Test: QUIC: long header. client initial. v6 vip v6 real. LRU miss duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.630934 2084700 BpfTester.cpp:306] Test: QUIC: short header. No connection id. LRU hit duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.644023 2084700 BpfTester.cpp:306] Test: QUIC: short header w/ connection id duration: 12 ns/pckt or 83333333 pps
I1110 06:13:35.657516 2084700 BpfTester.cpp:306] Test: QUIC: short header w/ connection id 1092 but non-existing mapping. LRU hit duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.670744 2084700 BpfTester.cpp:306] Test: QUIC: short header w/ conn id. host id = 0. LRU hit duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.707095 2084700 BpfTester.cpp:306] Test: UDP: big packet of length 1515. trigger PACKET TOOBIG duration: 36 ns/pckt or 27777777 pps
I1110 06:13:35.720397 2084700 BpfTester.cpp:306] Test: QUIC: short header w/ connection id. CIDv2 duration: 13 ns/pckt or 76923076 pps
I1110 06:13:35.733649 2084700 BpfTester.cpp:306] Test: QUIC: short header w/ connection id 197700 but non-existing mapping. CIDv2. LRU hit. duration: 13 ns/pckt or 76923076 pps
When I run the script with the -test_from_fixtures flag, many failing cases appear. I added the output here; maybe it's helpful:
I1110 06:19:17.944221 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.1:80:17
I1110 06:19:18.016070 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.1:80:6
I1110 06:19:18.087566 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.2:0:6
I1110 06:19:18.157521 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.4:0:6
I1110 06:19:18.227555 2092992 KatranLb.cpp:1112] modifying vip: 10.200.1.4:0:6
I1110 06:19:18.227571 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.3:80:6
I1110 06:19:18.296461 2092992 KatranLb.cpp:983] adding new vip: fc00:1::1:80:6
I1110 06:19:18.364773 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.5:443:17
I1110 06:19:18.364811 2092992 KatranLb.cpp:1112] modifying vip: 10.200.1.5:443:17
I1110 06:19:18.434492 2092992 KatranLb.cpp:983] adding new vip: fc00:1::2:443:17
I1110 06:19:18.434535 2092992 KatranLb.cpp:1112] modifying vip: fc00:1::2:443:17
I1110 06:19:18.502872 2092992 KatranLb.cpp:983] adding new vip: 10.200.1.99:80:6
I1110 06:19:18.502897 2092992 KatranLb.cpp:983] adding new vip: fc00:1::11:80:17
I1110 06:19:18.503070 2092992 BpfTester.cpp:258] Test: packet to UDP based v4 VIP (and v4 real) result: Failed
I1110 06:19:18.503154 2092992 BpfTester.cpp:258] Test: packet to TCP based v4 VIP (and v4 real) result: Failed
I1110 06:19:18.503226 2092992 BpfTester.cpp:258] Test: packet to TCP based v4 VIP (and v4 real) + ToS in IPV4 result: Failed
I1110 06:19:18.503299 2092992 BpfTester.cpp:258] Test: packet to TCP based v4 VIP (and v4 real; any dst ports). result: Failed
I1110 06:19:18.503373 2092992 BpfTester.cpp:258] Test: packet to TCP based v4 VIP (and v6 real) result: Passed
I1110 06:19:18.503445 2092992 BpfTester.cpp:258] Test: packet to TCP based v6 VIP (and v6 real) result: Passed
I1110 06:19:18.503520 2092992 BpfTester.cpp:258] Test: packet to TCP based v6 VIP (and v6 real) with ToS / tc set result: Passed
I1110 06:19:18.503589 2092992 BpfTester.cpp:258] Test: v4 ICMP echo-request result: Passed
I1110 06:19:18.503660 2092992 BpfTester.cpp:258] Test: v6 ICMP echo-request result: Passed
I1110 06:19:18.503734 2092992 BpfTester.cpp:258] Test: v4 ICMP dest-unreachabe fragmentation-needed result: Failed
I1110 06:19:18.503806 2092992 BpfTester.cpp:258] Test: v6 ICMP packet-too-big result: Passed
I1110 06:19:18.503876 2092992 BpfTester.cpp:258] Test: drop of IPv4 packet w/ options result: Passed
I1110 06:19:18.503949 2092992 BpfTester.cpp:258] Test: drop of IPv4 fragmented packet result: Passed
I1110 06:19:18.504019 2092992 BpfTester.cpp:258] Test: drop of IPv6 fragmented packet result: Passed
I1110 06:19:18.504101 2092992 BpfTester.cpp:258] Test: pass of v4 packet with dst not equal to any configured VIP result: Passed
I1110 06:19:18.504171 2092992 BpfTester.cpp:258] Test: pass of v6 packet with dst not equal to any configured VIP result: Passed
I1110 06:19:18.504242 2092992 BpfTester.cpp:258] Test: pass of arp packet result: Passed
I1110 06:19:18.504323 2092992 BpfTester.cpp:258] Test: LRU hit result: Failed
I1110 06:19:18.504395 2092992 BpfTester.cpp:258] Test: packet #1 dst port hashing only result: Failed
I1110 06:19:18.504465 2092992 BpfTester.cpp:258] Test: packet #2 dst port hashing only result: Failed
I1110 06:19:18.504537 2092992 BpfTester.cpp:258] Test: ipinip packet result: Passed
I1110 06:19:18.504608 2092992 BpfTester.cpp:258] Test: ipv6inipv6 packet result: Passed
I1110 06:19:18.504679 2092992 BpfTester.cpp:258] Test: ipv4inipv6 packet result: Passed
I1110 06:19:18.504752 2092992 BpfTester.cpp:258] Test: QUIC: long header. Client Initial type. LRU miss result: Failed
I1110 06:19:18.504825 2092992 BpfTester.cpp:258] Test: QUIC: long header. 0-RTT Protected. CH. LRU hit. result: Failed
I1110 06:19:18.504896 2092992 BpfTester.cpp:258] Test: QUIC: long header. Handshake. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. result: Passed
I1110 06:19:18.504972 2092992 BpfTester.cpp:258] Test: QUIC: long header. Retry. v4 vip v6 real. Conn Id V1 based. server id is 1024 mapped to fc00::1. result: Passed
I1110 06:19:18.505044 2092992 BpfTester.cpp:258] Test: QUIC: long header. client initial. v6 vip v6 real. LRU miss result: Passed
I1110 06:19:18.505115 2092992 BpfTester.cpp:258] Test: QUIC: short header. No connection id. LRU hit result: Passed
I1110 06:19:18.505187 2092992 BpfTester.cpp:258] Test: QUIC: short header w/ connection id result: Passed
I1110 06:19:18.505256 2092992 BpfTester.cpp:258] Test: QUIC: short header w/ connection id 1092 but non-existing mapping. LRU hit result: Passed
I1110 06:19:18.505326 2092992 BpfTester.cpp:258] Test: QUIC: short header w/ conn id. host id = 0. LRU hit result: Passed
I1110 06:19:18.505424 2092992 BpfTester.cpp:258] Test: UDP: big packet of length 1515. trigger PACKET TOOBIG result: Passed
I1110 06:19:18.505493 2092992 BpfTester.cpp:258] Test: QUIC: short header w/ connection id. CIDv2 result: Passed
I1110 06:19:18.505565 2092992 BpfTester.cpp:258] Test: QUIC: short header w/ connection id 197700 but non-existing mapping. CIDv2. LRU hit. result: Passed
I1110 06:19:18.505571 2092992 katran_tester.cpp:192] Testing counter's sanity. Printing on errors only
E1110 06:19:18.505623 2092992 katran_tester.cpp:216] LRU fallback counter is incorrect
E1110 06:19:18.505638 2092992 KatranLb.cpp:1965] Invalid server id 197700 in quic packet
E1110 06:19:18.505643 2092992 katran_tester.cpp:230] Counters for QUIC packets routed with CH: 4, with connection-id: 4
E1110 06:19:18.505647 2092992 katran_tester.cpp:233] Counters for routing of QUIC packets is wrong.
E1110 06:19:18.505651 2092992 katran_tester.cpp:245] QUIC CID drop counters v1 23 v2 0
E1110 06:19:18.505653 2092992 katran_tester.cpp:247] Counters for QUIC drops are wrong
I1110 06:19:18.505664 2092992 katran_tester.cpp:262] incorrect stats for real: 10.0.0.2
I1110 06:19:18.505667 2092992 katran_tester.cpp:263] Expected to be incorrect w/ non default build flags
I1110 06:19:18.505676 2092992 katran_tester.cpp:262] incorrect stats for real: fc00::1
I1110 06:19:18.505678 2092992 katran_tester.cpp:263] Expected to be incorrect w/ non default build flags
I1110 06:19:18.505681 2092992 katran_tester.cpp:262] incorrect stats for real: fc00::2
I1110 06:19:18.505684 2092992 katran_tester.cpp:263] Expected to be incorrect w/ non default build flags
I1110 06:19:18.505695 2092992 katran_tester.cpp:280] Testing of counters is complete
E1110 06:19:18.506040 2092992 KatranSimulator.cpp:168] src and dst must have same address family
E1110 06:19:18.506045 2092992 KatranSimulator.cpp:161] malformed src or dst ip address. src: aaaa dst: bbbb
E1110 06:19:18.506048 2092992 BpfLoader.cpp:80] Can't find prog with name: healthcheck_encap
I1110 06:19:18.506052 2092992 katran_tester.cpp:114] Healthchecking not enabled. Skipping HC related tests
12 ns per packet is not realistic. You probably forgot to set the INLINE_DECAP define while building the bpf program. Also, what custom (non-default) values for defines do you have? Tests will only show pass with the default defines.