trex-core
Multiple streams with different VLAN priorities cause high CPU utilization
Hello all,
I am facing a strange issue in the TRex stateless code, version v3.02. I am using a Mellanox ConnectX-5 and have created two VFs on top of PF 0.
I am trying to create two parallel streams with different VLAN priorities, but the generated load is not what I expect, and the CPU utilization seems incredibly high.
I have attached the TUI output when sending only one stream (trex_good.png) and when sending both streams (trex_bad.png). Additionally, I have attached the TUI utilization view (trex_util.png) for the "bad" scenario.
I have reproduced this issue both with and without the --software flag.
The script used is below; I am calling it with python3 automation/trex_control_plane/interactive/trex/examples/stl/single.py.
import stl_path
from trex.stl.api import *
import time
import pprint
from ipaddress import ip_address, ip_network
import argparse
import configparser
import os
import json
def get_packet(tos, mac_dst, ip_src, size):
    # pkt = Ether(src="02:00:00:00:00:01",dst="00:00:00:01:00:01") / IP(src="10.0.0.2", tos=tos) / UDP(sport=4444, dport=4444)
    pkt = (
        Ether(src="00:01:00:00:00:02", dst=mac_dst)
        # Ether(dst="11:11:11:11:11:11")
        # / Dot1AD(vlan=0)
        / Dot1Q(vlan=0, prio=tos)
        / IP(src=ip_src)
        / UDP(sport=4444, dport=4444)
    )
    pad = max(0, size - len(pkt)) * "x"
    return pkt / pad


def main():
    """ """
    tx_port = 0
    rx_port = 1

    c = STLClient()

    # connect to server
    c.connect()

    # prepare our ports
    c.reset(ports=[tx_port, rx_port])

    streams = []
    s = STLStream(
        packet=STLPktBuilder(
            pkt=get_packet(4, "00:11:22:33:44:55", "10.1.0.2", 512),
            # vm = vm,
        ),
        isg=0 * 1000000,
        mode=STLTXCont(pps=1.2 * 10**6),
        # flow_stats = STLFlowLatencyStats(pg_id = 0)
        flow_stats=STLFlowStats(pg_id=0),
    )
    streams.append(s)

    s2 = STLStream(
        packet=STLPktBuilder(
            pkt=get_packet(2, "00:11:22:33:44:55", "10.1.0.2", 512),
            # vm = vm,
        ),
        isg=0 * 1000000,
        mode=STLTXCont(pps=1.2 * 10**6),
        # flow_stats = STLFlowLatencyStats(pg_id = 0)
        flow_stats=STLFlowStats(pg_id=1),
    )
    streams.append(s2)

    c.add_streams(streams, ports=[tx_port])
    c.clear_stats()
    c.start(ports=[tx_port], duration=60, mult="25gbpsl1")
    c.wait_on_traffic(ports=[tx_port, rx_port])

    stats = c.get_stats()
    print(stats)


if __name__ == "__main__":
    main()
The following is my configuration
- port_limit: 2
  version: 2
  port_bandwidth_gb: 100
  interfaces: ["3b:00.2", "3b:00.3"]
  port_info:
    - dest_mac: 00:00:00:00:00:01
      src_mac: 00:01:00:00:00:01
    - dest_mac: 00:00:00:00:00:02
      src_mac: 00:01:00:00:00:02
  c: 14
  platform:
    master_thread_id: 8
    latency_thread_id: 27
    dual_if:
      - socket: 0
        threads: [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
Thank you!
I have also tested this in v3.04 and the bug remains.
I would appreciate any help with this issue :)
@rubensfig this is a DPDK mlx5 driver issue; I would report it to the maintainers on the DPDK forum
@hhaim Thank you for the pointer. I have posted it on [email protected].
Should I keep this ticket open, or close it and re-open it once the upstream DPDK issue gets resolved?
@rubensfig I would keep it open and update it if there is new info from the maintainers .. the mlx5 driver is a complex one with many dependencies
Hello @hhaim, everyone!
I have obtained some support from the DPDK mailing list; here is the relevant comment with the solution: https://mails.dpdk.org/archives/users/2024-April/007635.html
Essentially, we need to make sure the NIC-level QoS parameters are set. I am pasting the relevant commands below, from the DPDK thread.
sudo mlnx_qos -i <iface> --trust=dscp
for dscp in {0..63}; do sudo mlnx_qos -i <iface> --dscp2prio set,$dscp,0; sleep 0.001;done
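For reference, here is a minimal sketch of applying this to both ports from the config above and then verifying the result. The interface names enp59s0f0v0 and enp59s0f0v1 are hypothetical placeholders for whatever netdevs 3b:00.2 and 3b:00.3 map to on the host, and as far as I know running mlnx_qos with only -i prints the current trust state and dscp2prio table, which can be used to confirm the change took effect.
# Hypothetical netdev names for the two VFs (3b:00.2 / 3b:00.3); adjust to your host.
for iface in enp59s0f0v0 enp59s0f0v1; do
    # Trust the DSCP field instead of the VLAN PCP bits.
    sudo mlnx_qos -i "$iface" --trust=dscp
    # Map every DSCP value to priority 0.
    for dscp in {0..63}; do
        sudo mlnx_qos -i "$iface" --dscp2prio set,$dscp,0
        sleep 0.001
    done
    # Dump the resulting QoS state to verify the mapping.
    sudo mlnx_qos -i "$iface"
done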
I can create a documentation note about this in the Mellanox annex, under the limitations/issues section. What do you think? https://trex-tgn.cisco.com/trex/doc/trex_appendix_mellanox.html
@rubensfig thanks for looking into it. It would be great to add this command to the annex, and please mention the TRex/OFED/DPDK versions it applies to. So mlx5 has become even more complex now ..