trafficgen
Trex [v2.86] [2.87] dpdk_setup_ports.py script failed to generate server config file with multiple PCI info
Description of problem:
With TRex v2.86 and v2.87 specifically, dpdk_setup_ports.py fails to generate the trex_cfg.yaml file when the PCI devices passed to it belong to more than one NUMA node.
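To confirm which NUMA node each PCI device belongs to before running the script, the kernel exposes a numa_node file per device under /sys/bus/pci/devices. The following is a minimal sketch, not part of TRex; the helper names pci_numa_node and group_by_numa are hypothetical, and it assumes PCI domain 0000 when a short address such as 06:00.0 is given.

```python
from pathlib import Path


def pci_numa_node(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node of a PCI device, or None if it cannot be read."""
    # Short addresses such as 06:00.0 lack the PCI domain; sysfs uses the
    # full form, so assume domain 0000 when it is omitted (an assumption).
    if pci_addr.count(":") == 1:
        pci_addr = "0000:" + pci_addr
    node_file = Path(sysfs_root) / pci_addr / "numa_node"
    try:
        # The kernel reports -1 when the device has no NUMA affinity.
        return int(node_file.read_text().strip())
    except (OSError, ValueError):
        return None


def group_by_numa(pci_addrs, sysfs_root="/sys/bus/pci/devices"):
    """Group PCI addresses by the NUMA node reported in sysfs."""
    groups = {}
    for addr in pci_addrs:
        groups.setdefault(pci_numa_node(addr, sysfs_root), []).append(addr)
    return groups


if __name__ == "__main__":
    # Same PCI addresses as in the reproduction steps of this report.
    ports = ["06:00.0", "06:00.1", "08:00.0", "08:00.1",
             "85:00.0", "85:00.1", "83:00.0", "83:00.1"]
    for node, addrs in group_by_numa(ports).items():
        print(f"NUMA node {node}: {' '.join(addrs)}")
```

If the ports end up in more than one group, the setup matches the multi-NUMA case described here.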
Version-Release number of selected component (if applicable): TRex v2.86, TRex v2.87
How reproducible: pbench-trafficgen test
Steps to Reproduce:
- CPU Details:
# lscpu|grep CPU
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 56
On-line CPU(s) list: 0-55
CPU family: 6
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
CPU MHz: 2600.092
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55
- Tuned Config
# grep ^[^#] /etc/tuned/cpu-partitioning-variables.conf
isolated_cores=6-27,34-55
no_balance_cores=6-27,34-55
- Multiple TRex versions installed for validation
# ll /opt/trex/
total 12
lrwxrwxrwx. 1 root root 5 Jan 19 13:31 current -> v2.82
drwxr-xr-x. 18 33066 25 4096 Jan 18 04:41 v2.81
drwxr-xr-x. 18 33066 25 4096 Jan 14 17:30 v2.82
drwxr-xr-x. 18 33066 25 4096 Jan 18 04:35 v2.87
- Trafficgen fails when v2.87 is selected, because dpdk_setup_ports.py cannot generate trex_cfg.yaml.
# pwd;./dpdk_setup_ports.py -c 06:00.0 06:00.1 08:00.0 08:00.1 85:00.0 85:00.1 83:00.0 83:00.1 --cores-include 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 -o /tmp/trex_cfg.yaml --no-ht
/opt/trex/v2.87
Error upon running TRex to get interfaces info:
Starting TRex v2.87 please wait ...
EAL: Could not find space for memseg. Please increase CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE in configuration.
EAL: Couldn't remap hugepage files into memseg lists
EAL: FATAL: Cannot init memory
EAL: Cannot init memory
You might need to run ./trex-cfg once
EAL: Error - exiting with code: 1
Cause: Invalid EAL arguments
- The issue doesn't reproduce in v2.82
# pwd;./dpdk_setup_ports.py -c 06:00.0 06:00.1 08:00.0 08:00.1 85:00.0 85:00.1 83:00.0 83:00.1 --cores-include 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 -o /tmp/trex_cfg.yaml --no-ht
/opt/trex/v2.82
Saved to /tmp/trex_cfg.yaml.
# cat /tmp/trex_cfg.yaml
### Config file generated by dpdk_setup_ports.py ###
- version: 2
  interfaces: ['06:00.0', '06:00.1', '08:00.0', '08:00.1', '85:00.0', '85:00.1', '83:00.0', '83:00.1']
  port_bandwidth_gb: 25
  port_info:
      - ip: 1.1.1.1
        default_gw: 2.2.2.2
      - ip: 2.2.2.2
        default_gw: 1.1.1.1
      - ip: 3.3.3.3
        default_gw: 4.4.4.4
      - ip: 4.4.4.4
        default_gw: 3.3.3.3
      - ip: 5.5.5.5
        default_gw: 6.6.6.6
      - ip: 6.6.6.6
        default_gw: 5.5.5.5
      - ip: 7.7.7.7
        default_gw: 8.8.8.8
      - ip: 8.8.8.8
        default_gw: 7.7.7.7
  platform:
      master_thread_id: 27
      latency_thread_id: 26
      dual_if:
        - socket: 0
          threads: [6,8,10,12,14]
        - socket: 0
          threads: [16,18,20,22,24]
        - socket: 1
          threads: [7,9,11,13,15]
        - socket: 1
          threads: [17,19,21,23,25]
Expected results: dpdk_setup_ports.py should generate the config file when given multiple PCI devices.
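The --cores-include list in the commands above is the expansion of the tuned isolated_cores value (6-27,34-55). As a minimal sketch of that expansion, assuming only comma-separated single CPU ids and inclusive ranges (the helper name expand_cpu_list is hypothetical, and other syntax that tuned may accept is not handled):

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '6-27,34-55' into a list of CPU ids."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Inclusive range, e.g. '6-27' covers CPUs 6 through 27.
            start, end = part.split("-", 1)
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


if __name__ == "__main__":
    cores = expand_cpu_list("6-27,34-55")
    # Space-separated form, suitable for passing after --cores-include.
    print(" ".join(str(c) for c in cores))
```

For the value in this report the result has 44 entries, matching the cores passed on the command line.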
@pradiptapks do you have 1GB hugepages in both NUMA nodes?
@atheurer Yes, hugepages are properly configured.
# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-4.18.0-193.40.1.el8_2.x86_64 root=/dev/mapper/rhel_perf122-root ro crashkernel=auto resume=/dev/mapper/rhel_perf122-swap rd.lvm.lv=rhel_perf122/root rd.lvm.lv=rhel_perf122/swap skew_tick=1 nohz=on nohz_full=6-27,34-55 rcu_nocbs=6-27,34-55 tuned.non_isolcpus=00000003,f000003f intel_pstate=disable nosoftlockup default_hugepagesz=1GB hugepagesz=1G hugepages=100 iommu=pt intel_iommu=on
# cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages
50
50
# grep -i huge /proc/meminfo
AnonHugePages: 4096 kB
ShmemHugePages: 0 kB
HugePages_Total: 100
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
Hugetlb: 104857600 kB
FYI: the TRex server is currently running v2.82, so there are no free hugepages available. Deleting the rtemap_* files doesn't solve the issue either.
# ll /dev/hugepages/|grep -v total |wc -l
100
I have also started a discussion with the TRex group to find a resolution: https://groups.google.com/g/trex-tgn/c/7c62GKwmRhE
@k-rister let's use this existing issue for the trex config work you are fixing.
Sounds good. For context, here is a link to TRex upstream discussion:
https://groups.google.com/g/trex-tgn/c/XmAubHOSWoM/m/s4ZyrokoAwAJ