liboping
Results when pinging multiple hosts are not independent
Hi,
I reported an issue a few years ago against collectd because I noticed that the ping results were highly correlated with each other, while independent results were expected. I have now noticed that the same issue is present when using the oping CLI tool.
Especially when some hosts are unreachable, the results for the reachable hosts start to deviate a lot.
oping results:
$ oping 172.16.2.1 193.x.x.x 213.x.x.x 1.2.3.4 4.5.6.7
56 bytes from 172.16.2.1 (172.16.2.1): icmp_seq=296 ttl=64 time=1,14 ms
56 bytes from 193.x.x.x (193.x.x.x): icmp_seq=296 ttl=55 time=12,70 ms
56 bytes from 213.x.x.x (213.x.x.x): icmp_seq=296 ttl=51 time=15,55 ms
echo reply from 1.2.3.4 (1.2.3.4): icmp_seq=296 timeout
echo reply from 4.5.6.7 (4.5.6.7): icmp_seq=296 timeout
56 bytes from 172.16.2.1 (172.16.2.1): icmp_seq=297 ttl=64 time=1,28 ms
56 bytes from 193.x.x.x (193.x.x.x): icmp_seq=297 ttl=55 time=14,21 ms
56 bytes from 213.x.x.x (213.x.x.x): icmp_seq=297 ttl=51 time=17,23 ms
echo reply from 1.2.3.4 (1.2.3.4): icmp_seq=297 timeout
echo reply from 4.5.6.7 (4.5.6.7): icmp_seq=297 timeout
ping running in parallel with the above oping:
$ ping 172.16.2.1
PING 172.16.2.1 (172.16.2.1) 56(84) bytes of data.
64 bytes from 172.16.2.1: icmp_seq=1 ttl=64 time=0.674 ms
64 bytes from 172.16.2.1: icmp_seq=2 ttl=64 time=0.597 ms
64 bytes from 172.16.2.1: icmp_seq=3 ttl=64 time=0.624 ms
64 bytes from 172.16.2.1: icmp_seq=4 ttl=64 time=0.606 ms
64 bytes from 172.16.2.1: icmp_seq=5 ttl=64 time=0.676 ms
64 bytes from 172.16.2.1: icmp_seq=6 ttl=64 time=0.719 ms
64 bytes from 172.16.2.1: icmp_seq=7 ttl=64 time=0.596 ms
The oping results for 172.16.2.1 are roughly a factor of 2 too high compared to the plain ping results.
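To put a number on that comparison, here is a minimal sketch that parses the output excerpts pasted above and compares the mean round-trip times for 172.16.2.1. The helper name `rtts_for` and the regular expression are illustrative assumptions, not part of liboping; the sample lines are copied from the two runs in this report.

```python
import re
from statistics import mean

# Matches "from <host>" and the RTT field in both oping and ping output.
# oping may print a decimal comma depending on locale (e.g. "time=1,14 ms").
RTT_RE = re.compile(r"from (\S+?)[ :].*time=([0-9]+(?:[.,][0-9]+)?) ms")


def rtts_for(host, lines):
    """Return the RTTs in milliseconds reported for `host` (hypothetical helper)."""
    values = []
    for line in lines:
        match = RTT_RE.search(line)
        if match and match.group(1) == host:
            # Normalise a locale-specific decimal comma before converting.
            values.append(float(match.group(2).replace(",", ".")))
    return values


# Excerpt of the multi-host oping run from this report.
oping_lines = [
    "56 bytes from 172.16.2.1 (172.16.2.1): icmp_seq=296 ttl=64 time=1,14 ms",
    "echo reply from 1.2.3.4 (1.2.3.4): icmp_seq=296 timeout",
    "56 bytes from 172.16.2.1 (172.16.2.1): icmp_seq=297 ttl=64 time=1,28 ms",
    "echo reply from 1.2.3.4 (1.2.3.4): icmp_seq=297 timeout",
]

# The parallel single-host ping run from this report.
ping_lines = [
    "64 bytes from 172.16.2.1: icmp_seq=1 ttl=64 time=0.674 ms",
    "64 bytes from 172.16.2.1: icmp_seq=2 ttl=64 time=0.597 ms",
    "64 bytes from 172.16.2.1: icmp_seq=3 ttl=64 time=0.624 ms",
    "64 bytes from 172.16.2.1: icmp_seq=4 ttl=64 time=0.606 ms",
    "64 bytes from 172.16.2.1: icmp_seq=5 ttl=64 time=0.676 ms",
    "64 bytes from 172.16.2.1: icmp_seq=6 ttl=64 time=0.719 ms",
    "64 bytes from 172.16.2.1: icmp_seq=7 ttl=64 time=0.596 ms",
]

host = "172.16.2.1"
oping_mean = mean(rtts_for(host, oping_lines))
ping_mean = mean(rtts_for(host, ping_lines))
ratio = oping_mean / ping_mean

print(f"oping mean: {oping_mean:.3f} ms")
print(f"ping mean:  {ping_mean:.3f} ms")
print(f"ratio:      {ratio:.2f}")
```

With only two oping samples in the excerpt this is a rough indication rather than a measurement, but the ratio comes out at about 1.9, consistent with the "factor of 2" estimate.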
Same problem
$ oping 10.66.9.15 -c 5
PING 10.66.9.15 (10.66.9.15) 56 bytes of data.
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=1 ttl=63 time=0.79 ms
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=2 ttl=63 time=0.87 ms
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=3 ttl=63 time=0.88 ms
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=4 ttl=63 time=0.85 ms
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=5 ttl=63 time=0.87 ms
$ oping 10.66.9.15 8.8.8.8 8.8.4.4 208.67.220.220 208.67.222.222 100.0.0.2 100.0.0.3
PING 10.66.9.15 (10.66.9.15) 56 bytes of data.
PING 8.8.8.8 (8.8.8.8) 56 bytes of data.
PING 8.8.4.4 (8.8.4.4) 56 bytes of data.
PING 208.67.220.220 (208.67.220.220) 56 bytes of data.
PING 208.67.222.222 (208.67.222.222) 56 bytes of data.
PING 100.0.0.2 (100.0.0.2) 56 bytes of data.
PING 100.0.0.3 (100.0.0.3) 56 bytes of data.
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=1 ttl=63 time=3.68 ms
56 bytes from 8.8.8.8 (8.8.8.8): icmp_seq=1 ttl=112 time=15.46 ms
56 bytes from 8.8.4.4 (8.8.4.4): icmp_seq=1 ttl=117 time=24.81 ms
56 bytes from 208.67.220.220 (208.67.220.220): icmp_seq=1 ttl=50 time=49.47 ms
56 bytes from 208.67.222.222 (208.67.222.222): icmp_seq=1 ttl=50 time=51.63 ms
56 bytes from 100.0.0.2 (100.0.0.2): icmp_seq=1 ttl=47 time=169.97 ms
56 bytes from 100.0.0.3 (100.0.0.3): icmp_seq=1 ttl=47 time=179.70 ms
56 bytes from 10.66.9.15 (10.66.9.15): icmp_seq=2 ttl=63 time=3.54 ms
I can reliably reproduce the original problem with timeouts simply by pinging more than 8 hosts with noping simultaneously. It doesn't matter whether they are pinged from the same process or not. Everything is reliable up to 8 hosts; after that, nothing is.