libxlio
No traffic sent on Bluefield-2
Subject
Dear Team,
I'm trying to benchmark NVMe-oF over TCP/IP with and without XLIO. iperf and spdk_perf both work between the machines without XLIO, but when XLIO is used, no traffic comes out of the initiator. Nothing shows up in tcpdump, in the ethtool stats, on the switch between the machines, or on the target machine.
An example of a packet that spdk_perf tries to send is an ARP request:
#0 qp_mgr_eth_mlx5::fill_wqe (this=0xaaaaaada6010, pswr= ) at dev/qp_mgr_eth_mlx5.cpp:485
#1 0x0000fffff7024db0 in qp_mgr_eth_mlx5::send_to_wire (this=0xaaaaaada6010, p_send_wqe= , attr= , request_comp= , tis= , credits= ) at dev/qp_mgr_eth_mlx5.cpp:748
#2 0x0000fffff7020ac8 in qp_mgr::send (this=0xaaaaaada6010, p_send_wqe=p_send_wqe@entry=0xaaaaaada4510, attr=attr@entry=0, tis=tis@entry=0x0, credits=credits@entry=2) at dev/qp_mgr.cpp:611
#3 0x0000fffff704d9b0 in ring_simple::send_buffer (tis=0x0, attr= , p_send_wqe=0xaaaaaada4510, this=0xaaaaaada4920) at dev/ring_simple.cpp:746
#4 ring_simple::send_ring_buffer (this=0xaaaaaada4920, id= , p_send_wqe=0xaaaaaada4510, attr= ) at dev/ring_simple.cpp:776
#5 0x0000fffff7077794 in neigh_eth::send_arp_request (this=0xaaaaaada4370, is_broadcast= ) at proto/neighbour.cpp:1661
#6 0x0000fffff70724a4 in neigh_entry::send_discovery_request (this=0xaaaaaada4370) at proto/neighbour.cpp:393
After this, it successfully gets a completion in:
#0 cq_mgr_mlx5::poll_and_process_element_tx (this=0xaaaaaada63d0, p_cq_poll_sn=0xffffffffd680) at dev/cq_mgr_mlx5.cpp:542
#1 0x0000fffff7020a68 in qp_mgr::send (this=0xaaaaaada60d0, p_send_wqe=p_send_wqe@entry=0xaaaaaada4700, attr=attr@entry=0, tis=tis@entry=0x0, credits=credits@entry=2) at dev/qp_mgr.cpp:605
#2 0x0000fffff704d9b0 in ring_simple::send_buffer (tis=0x0, attr= , p_send_wqe=0xaaaaaada4700, this=0xaaaaaada4b10) at dev/ring_simple.cpp:746
#3 ring_simple::send_ring_buffer (this=0xaaaaaada4b10, id= , p_send_wqe=0xaaaaaada4700, attr= ) at dev/ring_simple.cpp:776
#4 0x0000fffff7077794 in neigh_eth::send_arp_request (this=0xaaaaaada4560, is_broadcast= ) at proto/neighbour.cpp:1661
#5 0x0000fffff70724a4 in neigh_entry::send_discovery_request (this=0xaaaaaada4560) at proto/neighbour.cpp:393
I've checked the device it's using for the ARP request and it looks correct: p1 (the name of the physical function interface on the Bluefield). Thanks in advance for any help.
Cheers
Issue type
- [x] Bug report
- [ ] Feature request
Configuration:
- Product version: XLIO_VERSION: 3.21.2-0 Development Snapshot built on Mar 22 2024 12:01:27 -- DEBUG --
  Git: d4767597a6478f5fddb6bc21dbb39455447cb627
- OS: Distributor ID: Ubuntu, Description: Ubuntu 20.04.6 LTS, Release: 20.04, Codename: focal
- OFED: MLNX_OFED_LINUX-5.8-3.0.5.0 (OFED-5.8-3.0.5)
- Hardware: Bluefield-2 MBF2M516A-CEEO_Ax_Bx (2x100Gbs)
Actual behavior:
No traffic comes out of the network interface even though the WQE is posted and its completion is received.
Expected behavior:
SPDK perf or iperf is able to connect and send traffic.
Steps to reproduce:
sudo LD_PRELOAD=/opt/mellanox/libxlio/lib/libxlio.so iperf -t 30 -c 20.20.20.4 -m -P 1 -i 1 -M 1500

or

sudo SPDK_XLIO_PATH=/opt/mellanox/libxlio/lib/libxlio.so XLIO_TRACELEVEL=DEBUG ~/spdk-23.01/build/examples/perf -q 64 -o $((2**12)) -w randread -r 'trtype:nvda_tcp adrfam:IPv4 traddr:20.20.20.4 trsvcid:4420' -t 300 -c 0x01 --transport-stats -G --default-sock-impl xlio
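
For anyone reproducing this, the "nothing in the ethtool stats" observation can be checked mechanically. Below is a minimal sketch in Python, not part of XLIO or SPDK. It snapshots `ethtool -S` for an interface twice and reports which transmit counters moved in between. The helper names (`parse_ethtool_stats`, `changed_counters`, `watch_tx`) are made up for illustration, and the sketch assumes the transmit counters of interest start with `tx_`, which may not hold for every driver.

```python
import re
import subprocess
import time


def parse_ethtool_stats(text):
    """Parse the text printed by `ethtool -S <iface>` into {name: value}."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     some_counter: 123"; the header
        # line "NIC statistics:" has no number and is skipped.
        match = re.match(r"^\s*(.+?):\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def changed_counters(before, after, prefix="tx_"):
    """Return {name: delta} for counters with the prefix that changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name.startswith(prefix)
        and name in before
        and after[name] != before[name]
    }


def snapshot(iface):
    """Run `ethtool -S` once and return the parsed counters."""
    result = subprocess.run(
        ["ethtool", "-S", iface],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ethtool_stats(result.stdout)


def watch_tx(iface, seconds):
    """Report which tx_ counters moved on `iface` over `seconds`."""
    before = snapshot(iface)
    time.sleep(seconds)
    return changed_counters(before, snapshot(iface))
```

Calling `watch_tx("p1", 10)` while one of the commands above is running returns an empty dict if no transmit counter moved during those 10 seconds, which would match the behavior described in this report.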