MoonGen
replay-pcap.lua: Using intervals from file segfaults
This is a continuation of #192 and uses the same pcap, but with an 82580 NIC instead.
Running
sudo build/MoonGen examples/pcap/replay-pcap.lua -r 1 -l 3 ea200usec.pcap
sometimes segfaults. Most of the time it just doesn't send out any traffic, as with the I210 in #192.
[INFO] Initializing DPDK. This will take a few seconds...
EAL: Detected 8 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: PCI device 0000:00:19.0 on NUMA socket 0
EAL: probe driver: 8086:153a net_e1000_em
EAL: PCI device 0000:01:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
EAL: PCI device 0000:02:00.0 on NUMA socket 0
EAL: probe driver: 8086:150e net_e1000_igb
EAL: PCI device 0000:02:00.1 on NUMA socket 0
EAL: probe driver: 8086:150e net_e1000_igb
EAL: PCI device 0000:02:00.2 on NUMA socket 0
EAL: probe driver: 8086:150e net_e1000_igb
EAL: PCI device 0000:02:00.3 on NUMA socket 0
EAL: probe driver: 8086:150e net_e1000_igb
EAL: PCI device 0000:03:00.0 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:03:00.1 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:03:00.2 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
EAL: PCI device 0000:03:00.3 on NUMA socket 0
EAL: probe driver: 8086:1521 net_e1000_igb
[INFO] Found 6 usable devices:
Device 0: A0:36:9F:62:86:4F (Intel Corporation I210 Gigabit Network Connection)
Device 1: 00:1B:21:A6:D1:BC (Intel Corporation 82580 Gigabit Network Connection)
Device 2: 00:1B:21:A6:D1:BD (Intel Corporation 82580 Gigabit Network Connection)
Device 3: 00:1B:21:A6:D1:BE (Intel Corporation 82580 Gigabit Network Connection)
Device 4: 00:1B:21:A6:D1:BF (Intel Corporation 82580 Gigabit Network Connection)
Device 5: A0:36:9F:A3:87:F6 (Intel Corporation I350 Gigabit Network Connection)
[INFO] Waiting for devices to come up...
[INFO] Device 3 (00:1B:21:A6:D1:BE) is up: 1000 MBit/s
[INFO] 1 device is up.
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
[Device: id=3] TX: 0.00 Mpps, 3 Mbit/s (3 Mbit/s with framing)
Segmentation fault
The core's stack trace is bogus:
(gdb) bt
#0 0x000055d88e599dd3 in ?? ()
#1 0x404f800000000000 in ?? ()
#2 0x402a000000000000 in ?? ()
#3 0x000055d87b1446c0 in ?? ()
#4 0x000000000000001a in ?? ()
#5 0x0000000000000001 in ?? ()
#6 0x4029724840739378 in ?? ()
#7 0x0000000000000000 in ?? ()
Not sure how useful, but here's the stack:
(gdb) x/32ga $rsp
0x7f98307fcb00: 0x404f800000000000 0x402a000000000000
0x7f98307fcb10: 0x563a4a6d86c0 <lcore_config> 0x1a
0x7f98307fcb20: 0x1 0x41e415c841c73378
0x7f98307fcb30: 0x0 0xc
0x7f98307fcb40: 0x563a4a6d8860 <lcore_config+416> 0x7f98307fcc1f
0x7f98307fcb50: 0x7f98307fcbf0 0x563a4a12ee41 <lua_pcall+177>
0x7f98307fcb60: 0x40636ed0 0x40943340
0x7f98307fcb70: 0x200000000 0x41c73378
0x7f98307fcb80: 0x40943340 0x41c73378
0x7f98307fcb90: 0x40636ed0 0x40943340
0x7f98307fcba0: 0x41c733b8 0x0
0x7f98307fcbb0: 0x7f98307fcbf0 0x563a4a0f3eab <libmoon::lua_core_main(void*)+166>
0x7f98307fcbc0: 0x0 0x563a4bfbbf10
0x7f98307fcbd0: 0x563a4bfbbf10 0x41c73378
0x7f98307fcbe0: 0x1a 0x7f98307fcc1f
0x7f98307fcbf0: 0xf 0x563a4a202d1d <eal_thread_loop+477>
This doesn't happen when the -r 1 is dropped.
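One possible reading of the bogus backtrace above, offered as an observation and not a diagnosis: the "return addresses" in frames #1 and #2 (which are also the first two words of the stack dump) have the bit pattern of small IEEE 754 doubles, which would fit gdb unwinding through data instead of real frames. A minimal Python sketch that reinterprets those two words; the helper name `as_double` is made up for illustration:

```python
import struct

def as_double(word: int) -> float:
    """Reinterpret a 64-bit word as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", word))[0]

# Values copied from frames #1 and #2 of the backtrace.
frames = [0x404f800000000000, 0x402a000000000000]

for word in frames:
    print(f"{word:#018x} -> {as_double(word)}")
```

Both words decode to whole numbers (63.0 and 13.0). What those numbers represent is not established anywhere in this thread.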
Can you please post your full hardware configuration and the Linux distribution that you are using?
Debian GNU/Linux 9.2 (stretch), running kernel 4.9.30-rt20 (PREEMPT RT patch). Running on an Intel(R) Xeon(R) CPU E5-1620 v3. Test was with an Intel 82580 which was connected over a TAP to the other port of the same 82580.
What do you mean by full hardware configuration? lshw(1) output?
Good/bad news: I've found a system where this is reproducible and will investigate. It looks like the ring interface between the threads is doing something bad.
Hello, any news on this? Is there something I can do to help? (besides fixing the bug, I don't know enough about DPDK to do that :-)
Hi all, same problem here. Running an Intel Corporation 82599ES 10-Gigabit SFI/SFP+ on a DELL9020 i7-4790, 32G RAM. DPDK and MoonGen are configured and set up, and work as expected. Using swapped-out kernel drivers as per DPDK:
Kernel driver in use: igb_uio
Kernel modules: ixgbe
Running:
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
Linux dpdk-injector 5.8.0-59-generic #66~20.04.1-Ubuntu SMP Thu Jun 17 11:14:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux