trex-core
Multiple streams with different VLAN priorities cause high CPU utilization
Hello all,
I am facing a strange issue in the TRex stateless code, version v3.02. I am using a Mellanox ConnectX-5 and have created two VFs on top of PF 0.
I am trying to create two parallel streams with different VLAN priorities, but the generated load is not what I expect, and the CPU utilization seems incredibly high.
I have attached the TUI output when sending only one stream (trex_good.png) and when sending both streams (trex_bad.png). Additionally, I have attached the TUI utilization view (trex_util.png) for the "bad" scenario.
I have reproduced this issue both with and without the --software flag.
The script used is below; I am calling it with python3 automation/trex_control_plane/interactive/trex/examples/stl/single.py.
import stl_path
from trex.stl.api import *
import time
import pprint
from ipaddress import ip_address, ip_network
import argparse
import configparser
import os
import json
def get_packet(tos, mac_dst, ip_src, size):
    # pkt = Ether(src="02:00:00:00:00:01",dst="00:00:00:01:00:01") / IP(src="10.0.0.2", tos=tos) / UDP(sport=4444, dport=4444)
    pkt = (
        Ether(src="00:01:00:00:00:02", dst=mac_dst)
        # Ether(dst="11:11:11:11:11:11")
        # / Dot1AD(vlan=0)
        / Dot1Q(vlan=0, prio=tos)
        / IP(src=ip_src)
        / UDP(sport=4444, dport=4444)
    )
    pad = max(0, size - len(pkt)) * "x"
    return pkt / pad


def main():
    """ """
    tx_port = 0
    rx_port = 1

    c = STLClient()

    # connect to server
    c.connect()

    # prepare our ports
    c.reset(ports=[tx_port, rx_port])

    streams = []
    s = STLStream(
        packet=STLPktBuilder(
            pkt=get_packet(4, "00:11:22:33:44:55", "10.1.0.2", 512),
            # vm = vm,
        ),
        isg=0 * 1000000,
        mode=STLTXCont(pps=1.2 * 10**6),
        # flow_stats = STLFlowLatencyStats(pg_id = 0)
        flow_stats=STLFlowStats(pg_id=0),
    )
    streams.append(s)

    s2 = STLStream(
        packet=STLPktBuilder(
            pkt=get_packet(2, "00:11:22:33:44:55", "10.1.0.2", 512),
            # vm = vm,
        ),
        isg=0 * 1000000,
        mode=STLTXCont(pps=1.2 * 10**6),
        # flow_stats = STLFlowLatencyStats(pg_id = 0)
        flow_stats=STLFlowStats(pg_id=1),
    )
    streams.append(s2)

    c.add_streams(streams, ports=[tx_port])
    c.clear_stats()
    c.start(ports=[tx_port], duration=60, mult="25gbpsl1")
    c.wait_on_traffic(ports=[tx_port, rx_port])

    stats = c.get_stats()
    print(stats)


if __name__ == "__main__":
    main()
The following is my configuration
- port_limit: 2
  version: 2
  port_bandwidth_gb: 100
  interfaces: ["3b:00.2", "3b:00.3"]
  port_info:
    - dest_mac: 00:00:00:00:00:01
      src_mac: 00:01:00:00:00:01
    - dest_mac: 00:00:00:00:00:02
      src_mac: 00:01:00:00:00:02
  c: 14
  platform:
    master_thread_id: 8
    latency_thread_id: 27
    dual_if:
      - socket: 0
        threads: [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
Thank you!
I have also tested this in v3.04 and the bug remains.
I would appreciate any help with this issue :)
@rubensfig this is a DPDK mlx5 driver issue; I would report it to the maintainers on the DPDK forum
@hhaim Thank you for the pointer. I have posted it on [email protected].
Should I keep this ticket open, or close it and re-open it once the upstream DPDK issue gets resolved?
@rubensfig I would keep it open and update it if there is new info from the maintainers .. the mlx5 driver is a complex one with many dependencies
Hello @hhaim, everyone!
I have obtained some support from the DPDK mailing list; here is the relevant comment with the solution: https://mails.dpdk.org/archives/users/2024-April/007635.html
Essentially, we need to make sure the NIC-level QoS parameters are set. I am pasting the relevant commands below, from the DPDK thread.
sudo mlnx_qos -i <iface> --trust=dscp
for dscp in {0..63}; do sudo mlnx_qos -i <iface> --dscp2prio set,$dscp,0; sleep 0.001;done
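For reference, here is a minimal sketch of applying this to both ports from the config above and then verifying the result. The interface names enp59s0f0v0 and enp59s0f0v1 are hypothetical placeholders for whatever netdevs 3b:00.2 and 3b:00.3 map to on the host, and as far as I know running mlnx_qos with only -i prints the current trust state and dscp2prio table, which can be used to confirm the change took effect.
# Hypothetical netdev names for the two VFs (3b:00.2 / 3b:00.3); adjust to your host.
for iface in enp59s0f0v0 enp59s0f0v1; do
    # Trust the DSCP field instead of the VLAN PCP bits.
    sudo mlnx_qos -i "$iface" --trust=dscp
    # Map every DSCP value to priority 0.
    for dscp in {0..63}; do
        sudo mlnx_qos -i "$iface" --dscp2prio set,$dscp,0
        sleep 0.001
    done
    # Dump the resulting QoS state to verify the mapping.
    sudo mlnx_qos -i "$iface"
done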
I can create a documentation note about this in the Mellanox annex, under the limitations/issues section. What do you think? https://trex-tgn.cisco.com/trex/doc/trex_appendix_mellanox.html
@rubensfig thanks for looking into it. It would be great to add this command to the annex, and please mention the TRex/OFED/DPDK versions it applies to. So mlx5 has become even more complex now ..