libvma
Issue: intermittent errno=111 (Connection refused) when running sockperf with libvma
I ran the sockperf test following the documentation at https://docs.nvidia.com/networking/display/vmav952/running+vma
I have two hosts, each with a dual-port Mellanox ConnectX-5 (CX5) NIC:
~# ethtool -i ens6f0np0
driver: mlx5_core
version: 23.07-0.5.1
firmware-version: 16.27.6008 (LNV0000000033)
expansion-rom-version:
bus-info: 0000:af:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
# show_gids
DEV PORT INDEX GID IPv4 VER DEV
--- ---- ----- --- ------------ --- ---
mlx5_0 1 0 fe80:0000:0000:0000:063f:72ff:fed0:cee6 v1 ens6f0np0
mlx5_0 1 1 fe80:0000:0000:0000:063f:72ff:fed0:cee6 v2 ens6f0np0
mlx5_0 1 2 0000:0000:0000:0000:0000:ffff:ac51:000a 172.81.0.10 v1 ens6f0np0
mlx5_0 1 3 0000:0000:0000:0000:0000:ffff:ac51:000a 172.81.0.10 v2 ens6f0np0
mlx5_0 1 4 fd00:0081:0000:0000:0172:0081:0000:0010 v1 ens6f0np0
mlx5_0 1 5 fd00:0081:0000:0000:0172:0081:0000:0010 v2 ens6f0np0
mlx5_1 1 0 fe80:0000:0000:0000:063f:72ff:fed0:cee7 v1 ens6f1np1
mlx5_1 1 1 fe80:0000:0000:0000:063f:72ff:fed0:cee7 v2 ens6f1np1
mlx5_1 1 10 fd00:0090:0000:0000:0000:0000:0000:0010 v1 ens6f1np1.90
mlx5_1 1 11 fd00:0090:0000:0000:0000:0000:0000:0010 v2 ens6f1np1.90
mlx5_1 1 2 0000:0000:0000:0000:0000:ffff:ac52:000a 172.82.0.10 v1 ens6f1np1
mlx5_1 1 3 0000:0000:0000:0000:0000:ffff:ac52:000a 172.82.0.10 v2 ens6f1np1
mlx5_1 1 4 fd00:0082:0000:0000:0172:0082:0000:0010 v1 ens6f1np1
mlx5_1 1 5 fd00:0082:0000:0000:0172:0082:0000:0010 v2 ens6f1np1
mlx5_1 1 6 fe80:0000:0000:0000:063f:72ff:fed0:cee7 v1 ens6f1np1.90
mlx5_1 1 7 fe80:0000:0000:0000:063f:72ff:fed0:cee7 v2 ens6f1np1.90
mlx5_1 1 8 0000:0000:0000:0000:0000:ffff:ac5a:000a 172.90.0.10 v1 ens6f1np1.90
mlx5_1 1 9 0000:0000:0000:0000:0000:ffff:ac5a:000a 172.90.0.10 v2 ens6f1np1.90
mlx5_2 1 0 fe80:0000:0000:0000:9888:49ff:fed9:428f v1 ens6f0v0
mlx5_2 1 1 fe80:0000:0000:0000:9888:49ff:fed9:428f v2 ens6f0v0
mlx5_3 1 0 fe80:0000:0000:0000:408d:07ff:feb3:0a9b v1 ens6f0v1
mlx5_3 1 1 fe80:0000:0000:0000:408d:07ff:feb3:0a9b v2 ens6f0v1
mlx5_4 1 0 fe80:0000:0000:0000:14ab:adff:fef9:16d7 v1 ens6f0v2
mlx5_4 1 1 fe80:0000:0000:0000:14ab:adff:fef9:16d7 v2 ens6f0v2
mlx5_5 1 0 fe80:0000:0000:0000:0891:f4ff:febc:46e2 v1 ens6f0v3
mlx5_5 1 1 fe80:0000:0000:0000:0891:f4ff:febc:46e2 v2 ens6f0v3
n_gids_found=26
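For readers cross-checking the table above: the GIDs at index 2 and 3 (and 8 and 9 on the VLAN interface) are IPv4-mapped IPv6 addresses, so the last 32 bits encode the interface's IPv4 address. A minimal sketch of that decoding, using only the Python standard library; the helper name `gid_to_ipv4` is made up for illustration and is not part of any Mellanox tool:

```python
import ipaddress


def gid_to_ipv4(gid):
    """Return the IPv4 address embedded in an IPv4-mapped GID, or None."""
    # ipv4_mapped is None unless the address has the ::ffff:a.b.c.d form.
    mapped = ipaddress.IPv6Address(gid).ipv4_mapped
    return str(mapped) if mapped is not None else None


# GID strings copied from the show_gids output above.
print(gid_to_ipv4("0000:0000:0000:0000:0000:ffff:ac51:000a"))  # 172.81.0.10
print(gid_to_ipv4("fe80:0000:0000:0000:063f:72ff:fed0:cee6"))  # None (link-local)
```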
# uname -a
Linux 10-20-1-10 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04 (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
Without libvma, sockperf between the two hosts succeeds every time.
On the server host 172.81.0.20:
# sockperf sr --tcp -i 172.81.0.20 -p 15000
sockperf: == version #3.7-no.git ==
sockperf: [SERVER] listen on:
[ 0] IP = 172.81.0.20 PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 2797496] using recvfrom() to block on socket(s)
On the client host 172.81.0.10:
# sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
sockperf: == version #3.10-no.git ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 172.81.0.20 PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.000 sec; Warm up time=400 msec; SentMessages=40500; ReceivedMessages=40499
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=0.550 sec; SentMessages=23199; ReceivedMessages=23199
sockperf: ====> avg-latency=11.813 (std-dev=1.441, mean-ad=0.623, median-ad=0.487, siqr=0.337, cv=0.122, std-error=0.009, 99.0% ci=[11.789, 11.837])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 11.813 usec
sockperf: Total 23199 observations; each percentile contains 231.99 observations
sockperf: ---> <MAX> observation = 107.870
sockperf: ---> percentile 99.999 = 107.870
sockperf: ---> percentile 99.990 = 60.648
sockperf: ---> percentile 99.900 = 21.000
sockperf: ---> percentile 99.000 = 17.315
sockperf: ---> percentile 90.000 = 12.599
sockperf: ---> percentile 75.000 = 11.947
sockperf: ---> percentile 50.000 = 11.564
sockperf: ---> percentile 25.000 = 11.272
sockperf: ---> <MIN> observation = 10.458
With libvma, however, the same test sometimes fails.
On the server host 172.81.0.20:
# LD_PRELOAD=libvma.so sockperf sr --tcp -i 172.81.0.20 -p 15000
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 9.8.31-0 Development Snapshot built on Oct 10 2023 11:31:55 -*- DEBUG -*-
VMA INFO: Cmd Line: sockperf sr --tcp -i 172.81.0.20 -p 15000
VMA INFO: Current Time: Tue Oct 10 12:05:15 2023
VMA INFO: Pid: 2813781
VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
VMA INFO: Architecture: x86_64
VMA INFO: Node: 10-20-1-20
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: ---------------------------------------------------------------------------
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.110 netmask: 255.255.255.255 dev: veth9878877221a table :500 scope 253 type 1 index 43 scope 253 type 1 index 43
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.119 netmask: 255.255.255.255 dev: vethd861c2e0cf5 table :500 scope 253 type 1 index 34 scope 253 type 1 index 34
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.104 netmask: 255.255.255.255 dev: vethf917a4f52ae table :500 scope 253 type 1 index 42 scope 253 type 1 index 42
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.114 netmask: 255.255.255.255 dev: cali21b37a164ee table :500 scope 253 type 1 index 40 scope 253 type 1 index 40
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.108 netmask: 255.255.255.255 dev: veth9810b9fa995 table :500 scope 253 type 1 index 44 scope 253 type 1 index 44
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.0 netmask: 255.255.255.192 dev: table :main scope 0 type 6 index 0 scope 0 type 6 index 0
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.4 netmask: 255.255.255.255 dev: calic5e25250998 table :main scope 253 type 1 index 41 scope 253 type 1 index 41
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.37 netmask: 255.255.255.255 dev: cali21b37a164ee table :main scope 253 type 1 index 40 scope 253 type 1 index 40
sockperf: == version #3.7-no.git ==
sockperf: [SERVER] listen on:
[ 0] IP = 172.81.0.20 PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 2813781] using recvfrom() to block on socket(s)
On the client host 172.81.0.10:
# LD_PRELOAD=libvma.so sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 9.8.31-1 Release built on Jul 10 2023 11:42:20
VMA INFO: Cmd Line: sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: ---------------------------------------------------------------------------
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.100 netmask: 255.255.255.255 dev: veth29f76130861 table :500 scope 253 type 1 index 20 scope 253 type 1 index 20
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.120 netmask: 255.255.255.255 dev: veth35695ee5c1e table :500 scope 253 type 1 index 17 scope 253 type 1 index 17
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.111 netmask: 255.255.255.255 dev: veth47f77b93392 table :500 scope 253 type 1 index 22 scope 253 type 1 index 22
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.119 netmask: 255.255.255.255 dev: calic0d6e116972 table :500 scope 253 type 1 index 18 scope 253 type 1 index 18
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.100 netmask: 255.255.255.255 dev: veth4dd7b95a373 table :500 scope 253 type 1 index 21 scope 253 type 1 index 21
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.128 netmask: 255.255.255.192 dev: table :main scope 0 type 6 index 0 scope 0 type 6 index 0
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.131 netmask: 255.255.255.255 dev: cali00a8163a6e5 table :main scope 253 type 1 index 32 scope 253 type 1 index 32
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.146 netmask: 255.255.255.255 dev: cali7cef15e86e8 table :main scope 253 type 1 index 33 scope 253 type 1 index 33
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.164 netmask: 255.255.255.255 dev: calic0d6e116972 table :main scope 253 type 1 index 18 scope 253 type 1 index 18
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.182 netmask: 255.255.255.255 dev: cali2f01bce650e table :main scope 253 type 1 index 31 scope 253 type 1 index 31
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.183 netmask: 255.255.255.255 dev: calid7cf868faf9 table :main scope 253 type 1 index 19 scope 253 type 1 index 19
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.184 netmask: 255.255.255.255 dev: cali92710158f24 table :main scope 253 type 1 index 35 scope 253 type 1 index 35
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.185 netmask: 255.255.255.255 dev: calie84e04abcc4 table :main scope 253 type 1 index 34 scope 253 type 1 index 34
sockperf: == version #3.10-no.git ==
sockperf: ERROR: Can`t connect socket (errno=111 Connection refused)
At other times, the same command succeeds with libvma:
# LD_PRELOAD=libvma.so sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: VMA_VERSION: 9.8.31-1 Release built on Jul 10 2023 11:42:20
VMA INFO: Cmd Line: sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
VMA INFO: ---------------------------------------------------------------------------
VMA INFO: Log Level INFO [VMA_TRACELEVEL]
VMA INFO: ---------------------------------------------------------------------------
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.100 netmask: 255.255.255.255 dev: veth29f76130861 table :500 scope 253 type 1 index 20 scope 253 type 1 index 20
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.120 netmask: 255.255.255.255 dev: veth35695ee5c1e table :500 scope 253 type 1 index 17 scope 253 type 1 index 17
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.111 netmask: 255.255.255.255 dev: veth47f77b93392 table :500 scope 253 type 1 index 22 scope 253 type 1 index 22
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.119 netmask: 255.255.255.255 dev: calic0d6e116972 table :500 scope 253 type 1 index 18 scope 253 type 1 index 18
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.100 netmask: 255.255.255.255 dev: veth4dd7b95a373 table :500 scope 253 type 1 index 21 scope 253 type 1 index 21
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.128 netmask: 255.255.255.192 dev: table :main scope 0 type 6 index 0 scope 0 type 6 index 0
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.131 netmask: 255.255.255.255 dev: cali00a8163a6e5 table :main scope 253 type 1 index 32 scope 253 type 1 index 32
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.146 netmask: 255.255.255.255 dev: cali7cef15e86e8 table :main scope 253 type 1 index 33 scope 253 type 1 index 33
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.164 netmask: 255.255.255.255 dev: calic0d6e116972 table :main scope 253 type 1 index 18 scope 253 type 1 index 18
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.182 netmask: 255.255.255.255 dev: cali2f01bce650e table :main scope 253 type 1 index 31 scope 253 type 1 index 31
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.183 netmask: 255.255.255.255 dev: calid7cf868faf9 table :main scope 253 type 1 index 19 scope 253 type 1 index 19
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.184 netmask: 255.255.255.255 dev: cali92710158f24 table :main scope 253 type 1 index 35 scope 253 type 1 index 35
VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.185 netmask: 255.255.255.255 dev: calie84e04abcc4 table :main scope 253 type 1 index 34 scope 253 type 1 index 34
sockperf: == version #3.10-no.git ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)
[ 0] IP = 172.81.0.20 PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.000 sec; Warm up time=400 msec; SentMessages=103730; ReceivedMessages=103729
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=0.550 sec; SentMessages=57262; ReceivedMessages=57262
sockperf: ====> avg-latency=4.778 (std-dev=1.034, mean-ad=0.218, median-ad=0.076, siqr=0.051, cv=0.216, std-error=0.004, 99.0% ci=[4.767, 4.789])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 4.778 usec
sockperf: Total 57262 observations; each percentile contains 572.62 observations
sockperf: ---> <MAX> observation = 132.789
sockperf: ---> percentile 99.999 = 102.016
sockperf: ---> percentile 99.990 = 13.302
sockperf: ---> percentile 99.900 = 12.760
sockperf: ---> percentile 99.000 = 9.170
sockperf: ---> percentile 90.000 = 4.808
sockperf: ---> percentile 75.000 = 4.720
sockperf: ---> percentile 50.000 = 4.673
sockperf: ---> percentile 25.000 = 4.616
sockperf: ---> <MIN> observation = 4.278
A detailed log from the client host for the failing case is attached: fail-client-log.txt
In brief: with libvma, the same command sometimes succeeds and sometimes fails; without libvma, it succeeds every time.
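To put a number on "sometimes", the client command could be run repeatedly and the refused connections counted. The following is only a rough sketch under stated assumptions: the helper `count_refused` is hypothetical, the command and the `LD_PRELOAD` value are copied from the runs above, and a run is counted as failed when its output contains the string `errno=111`.

```python
import os
import subprocess

# Client command taken from the report; the server must already be running.
SOCKPERF_CMD = ["sockperf", "pp", "--tcp", "-i", "172.81.0.20",
                "-p", "15000", "-t", "1"]
REFUSED_MARKER = "errno=111"


def count_refused(cmd, runs, extra_env=None):
    """Run cmd `runs` times and return how many runs printed REFUSED_MARKER."""
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. {"LD_PRELOAD": "libvma.so"}
    refused = 0
    for _ in range(runs):
        # Merge stderr into stdout so the marker is found on either stream.
        result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        if REFUSED_MARKER in result.stdout:
            refused += 1
    return refused
```

On the client host, `count_refused(SOCKPERF_CMD, 20, {"LD_PRELOAD": "libvma.so"})` compared with `count_refused(SOCKPERF_CMD, 20)` would give a failure count with and without libvma; the sample size of 20 is arbitrary.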