openwisp-monitoring

[monitoring/checks] Add iperf check

Open nemesifier opened this issue 4 years ago • 7 comments

The goal of this project is to add a bandwidth test using iperf3, using the active check mechanism of openwisp monitoring.

The use case is to perform periodic bandwidth tests to measure the maximum available bandwidth (TCP test) and jitter (UDP test).

On a macro level, the check would work this way:

  1. openwisp connects to the device via SSH (only 1 check per device at a time) and launches iperf3 as a client, first in TCP mode, then in UDP mode. The server is configured via a setting, and the -J flag is used to obtain JSON output
  2. the collected data is parsed and stored as a metric (bandwidth information and jitter)
  3. SSH connection is closed
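
The flow above can be sketched roughly as follows. This is a minimal illustration, not the implementation that was merged: the helper names (`iperf3_command`, `parse_tcp`, `parse_udp`) are hypothetical, the sample values are made up, and the JSON field names are the ones iperf3 usually emits, so they should be checked against the iperf3 version installed on the devices. Returning zeros on error is an assumption of this sketch.

```python
import json
import shlex


def iperf3_command(server, udp=False, port=5201):
    """Build the iperf3 client command line; -J asks for JSON output."""
    cmd = f'iperf3 -c {shlex.quote(server)} -p {port} -J'
    return f'{cmd} -u' if udp else cmd


def parse_tcp(raw):
    """Extract TCP bandwidth (bits per second) from iperf3 JSON output."""
    data = json.loads(raw)
    if 'error' in data:
        # Assumption of this sketch: a failed test is recorded as zero.
        return {'sent_bps': 0.0, 'received_bps': 0.0}
    end = data['end']
    return {
        'sent_bps': end['sum_sent']['bits_per_second'],
        'received_bps': end['sum_received']['bits_per_second'],
    }


def parse_udp(raw):
    """Extract UDP jitter and packet loss from iperf3 JSON output."""
    data = json.loads(raw)
    if 'error' in data:
        return {'jitter_ms': 0.0, 'lost_percent': 0.0}
    summary = data['end']['sum']
    return {
        'jitter_ms': summary['jitter_ms'],
        'lost_percent': summary['lost_percent'],
    }


# Trimmed, made-up samples that mimic the shape of iperf3 -J output.
tcp_raw = json.dumps({
    'end': {
        'sum_sent': {'bits_per_second': 94500000.0},
        'sum_received': {'bits_per_second': 93800000.0},
    }
})
udp_raw = json.dumps({'end': {'sum': {'jitter_ms': 0.42, 'lost_percent': 0.0}}})
error_raw = json.dumps({'error': 'unable to connect to server'})

print(iperf3_command('iperf.example.com'))
print(parse_tcp(tcp_raw), parse_udp(udp_raw), parse_tcp(error_raw))
```

In a real check the command strings would be executed on the device over the existing SSH connection and the captured stdout passed to the parsers.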

The steps to do are more or less the following:

  • create iperf check class, the check must use the connection module of openwisp-controller to connect to devices using SSH
  • If a device has no active Connection the check will be skipped and a warning logged
  • this check should be optional and disabled by default
  • we can run it by default every night
  • allow configuring the iperf server globally and by organization with a setting, eg:
OPENWISP_MONITORING_IPERF_SERVERS = {
    '': '<DEFAULT_IPERF_SERVER_HERE>',
    '<org-pk>': '<ORG_IPERF_SERVER>'
}
  • we have to implement a lock to allow only 1 iperf check per server at a time, see https://docs.celeryproject.org/en/latest/tutorials/task-cookbook.html#ensuring-a-task-is-only-executed-one-at-a-time and https://stackoverflow.com/questions/12003221/celery-task-schedule-ensuring-a-task-is-only-executed-one-at-a-time/12003293 that is: for every server available, only 1 check can be performed at a time, so the lock has to take this into account when calculating the cache key
  • ssh into device, launch iperf TCP client, repeat for UDP, collect data of both tests in a data structure
  • handle failures, if server is down, we can store 0, which would trigger an alert (threshold)
  • get or create metric
  • get or create chart
  • get or create threshold
  • save data (tcp max bandwidth, UDP jitter)
  • document this check
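
Two of the steps above, the per-organization server lookup and the per-server lock, could look roughly like this. It is only a sketch under stated assumptions: `get_iperf_server`, `lock_key`, `iperf_server_lock` and the example hostnames are hypothetical, and `InMemoryCache` is a stand-in for a shared cache backend whose `add` call only succeeds when the key is absent (the behaviour the linked Celery cookbook recipe relies on).

```python
import hashlib
from contextlib import contextmanager

# Hypothetical values following the shape of the proposed setting:
# '' is the default server, other keys are organization primary keys.
IPERF_SERVERS = {
    '': 'iperf.example.com',
    'org-pk-1': 'iperf.org1.example.com',
}


def get_iperf_server(org_pk, servers=IPERF_SERVERS):
    """Return the organization's server, falling back to the default."""
    return servers.get(str(org_pk), servers.get(''))


class InMemoryCache:
    """Stand-in for a shared cache; the timeout is accepted but ignored."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        # Succeeds only if the key is not already present.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def lock_key(server):
    """Derive the cache key from the server, so locks are per server."""
    digest = hashlib.sha256(server.encode()).hexdigest()
    return f'iperf-check-lock-{digest}'


@contextmanager
def iperf_server_lock(cache, server, timeout=600):
    """Yield True if the lock for this server was acquired, else False."""
    key = lock_key(server)
    acquired = cache.add(key, 'locked', timeout)
    try:
        yield acquired
    finally:
        if acquired:
            cache.delete(key)


cache = InMemoryCache()
server = get_iperf_server('org-pk-1')
with iperf_server_lock(cache, server) as first:
    with iperf_server_lock(cache, server) as second:
        print(first, second)  # the second attempt on the same server is refused
```

A check that fails to acquire the lock could be skipped or retried later; which of the two is preferable is left open here.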

nemesifier avatar Jun 08 '20 17:06 nemesifier

Hi @nemesisdesign, my name is Saurabh Mokashi, a CSE undergrad from NITK, India. I am interested in contributing to this project, so kindly guide me through the materials and resources required.

Saurabh-Mokashi avatar Mar 13 '22 15:03 Saurabh-Mokashi

Please use the development chat to ask generic questions and let's keep this for technical and specific questions. Thank you.

nemesifier avatar Mar 14 '22 13:03 nemesifier

I'd suggest using the low duty cycle bounceback test found in iperf 2.

iperf -c --bounceback --hide-ips --permit-key=openwisptest

Client connecting to (hidden), TCP port 5001 with pid 331523 (1 flows)
Write buffer size: 100 Byte
Bursting: 100 Byte writes 10 times every 1.00 second(s)
Bounce-back test (size= 100 Byte) (server hold req=0 usecs & tcp_quickack)
TOS set to 0x0 and nodelay (Nagle off)
TCP window size: 16.0 KByte (default)

[openwisptest(1)] local ...114%wlan0 port 39894 connected with ...123 port 5001 (bb w/quickack len/hold=100/0) (sock=3) (icwnd/mss/irtt=14/1448/11741) (ct=11.91 ms) on 2022-08-11 03:28:41 (UTC)
[ ID] Interval Transfer Bandwidth BB cnt=avg/min/max/stdev Rtry Cwnd/RTT RPS
[openwisptest(1)] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec 10=14.909/11.397/19.917/2.765 ms 0 14K/13585 us 66 rps
[openwisptest(1)] 1.00-2.00 sec 1.95 KBytes 16.0 Kbits/sec 10=14.978/11.950/21.678/3.158 ms 0 14K/14738 us 66 rps
[openwisptest(1)] 2.00-3.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.047/11.164/15.055/1.237 ms 0 14K/12591 us 82 rps
[openwisptest(1)] 3.00-4.00 sec 1.95 KBytes 16.0 Kbits/sec 10=13.303/11.212/16.792/2.108 ms 0 14K/12856 us 75 rps
[openwisptest(1)] 4.00-5.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.266/11.029/15.480/1.470 ms 0 14K/12199 us 81 rps
[openwisptest(1)] 5.00-6.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.155/10.879/16.140/1.590 ms 0 14K/11997 us 82 rps
[openwisptest(1)] 6.00-7.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.769/11.145/17.341/2.149 ms 0 14K/12673 us 78 rps
[openwisptest(1)] 7.00-8.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.331/11.108/16.011/1.943 ms 0 14K/12103 us 80 rps
[openwisptest(1)] 8.00-9.00 sec 1.95 KBytes 16.0 Kbits/sec 10=12.195/10.746/16.130/1.858 ms 0 14K/11998 us 81 rps
[openwisptest(1)] 9.00-10.00 sec 1.95 KBytes 16.0 Kbits/sec 10=13.315/11.021/16.167/2.096 ms 0 14K/13124 us 74 rps
[openwisptest(1)] 0.00-10.05 sec 19.5 KBytes 15.9 Kbits/sec 100=13.027/10.746/21.678/2.272 ms 0 14K/13246 us 76 rps
[ 1] 0.00-10.05 sec BB8(f)-PDF: bin(w=100us):cnt(100)=108:1,109:1,110:1,111:2,112:9,113:10,114:4,115:7,116:6,117:1,118:3,119:5,120:3,121:2,122:1,123:1,124:1,127:4,128:4,129:1,130:1,133:1,135:2,138:1,144:1,151:2,152:1,155:3,156:1,157:1,158:2,160:4,161:2,162:3,164:1,166:1,168:1,174:1,175:2,200:1,217:1 (5.00/95.00/99.7%=111/174/217,Outliers=0,obl/obu=0/0)

rjmcmahon avatar Aug 11 '22 03:08 rjmcmahon

@rjmcmahon thanks for the suggestion but we're using iperf3, is anything equivalent implemented in iperf3 too?

nemesifier avatar Aug 11 '22 08:08 nemesifier

Iperf 3 is poorly named. It's designed to measure networks that support large data sets, CERN to DoE sites. Those testing WiFi would benefit by using iperf 2. TXOPs are a major limiting factor to WiFi responsiveness so the tool used should test more than link capacity.

Comparison chart

rjmcmahon avatar Aug 11 '22 17:08 rjmcmahon

The main problem with iperf2 is the lack of JSON output for parsing its results. CSV could be used, but it would have to be seen how cumbersome that would be. Once iperf3 is implemented we can find out how much effort would be required to support iperf2 too.

Thanks for the valuable info, I didn't know those details about the history of iperf2 vs iperf3. Some people would also prefer to use speedtest, so finding the right tools to satisfy all needs will require ongoing discussion and several iterations, and it will take some time to find the resources.

nemesifier avatar Aug 11 '22 18:08 nemesifier

Parsing is fairly easy to do with Python regular expressions. There are examples in the flows directory. Flows is also a nice wrapper layer that can be used to feed statistics and plotting. One can convert dictionaries to JSON if needed using the standard Python json library.

`self.regex_traffic = re.compile(r'[\s+\d+] (?P.*) sec\s+(?P\d+) Bytes\s+(?P\d+) bits/sec\s+(?P\d+)/(?P\d+)\s+(?P\d+)\s+(?P\d+)K/(?P\d+) us')`
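
As an illustration of the regex-to-JSON approach, here is a small self-contained sketch that parses one interval line of the bounceback output posted earlier in this thread. It is not the code from the flows directory: the pattern, the group names and the `parse_bounceback_line` helper are all made up for this example and only cover that specific line layout.

```python
import json
import re

# Hypothetical pattern and group names, written for the interval lines
# of the bounceback sample shown earlier in this thread.
BOUNCEBACK_LINE = re.compile(
    r'\[(?P<flow>[^\]]+)\]\s+'
    r'(?P<start>\d+\.\d+)-(?P<end>\d+\.\d+) sec\s+'
    r'(?P<transfer>\d+(?:\.\d+)?) (?P<transfer_unit>\w+)\s+'
    r'(?P<rate>\d+(?:\.\d+)?) (?P<rate_unit>\w+/sec)\s+'
    r'(?P<count>\d+)=(?P<avg>[\d.]+)/(?P<min>[\d.]+)/'
    r'(?P<max>[\d.]+)/(?P<stdev>[\d.]+) ms\s+'
    r'(?P<retries>\d+)\s+'
    r'(?P<cwnd>\d+)K/(?P<rtt_us>\d+) us\s+'
    r'(?P<rps>\d+) rps'
)


def parse_bounceback_line(line):
    """Return a dict for an interval line, or None if it does not match."""
    match = BOUNCEBACK_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        'flow': fields['flow'],
        'interval_sec': [float(fields['start']), float(fields['end'])],
        'rate': float(fields['rate']),
        'rate_unit': fields['rate_unit'],
        'bb_count': int(fields['count']),
        'bb_avg_ms': float(fields['avg']),
        'bb_min_ms': float(fields['min']),
        'bb_max_ms': float(fields['max']),
        'bb_stdev_ms': float(fields['stdev']),
        'retries': int(fields['retries']),
        'rtt_us': int(fields['rtt_us']),
        'rps': int(fields['rps']),
    }


sample = (
    '[openwisptest(1)] 0.00-1.00 sec 1.95 KBytes 16.0 Kbits/sec '
    '10=14.909/11.397/19.917/2.765 ms 0 14K/13585 us 66 rps'
)
parsed = parse_bounceback_line(sample)
print(json.dumps(parsed, indent=2))
```

Header and histogram lines do not match the pattern and yield `None`, so a caller can feed the whole output line by line and keep only the parsed intervals.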

A major point is that latency/responsiveness needs to be monitored.

rjmcmahon avatar Aug 11 '22 18:08 rjmcmahon

Solved and closed by: https://github.com/openwisp/openwisp-monitoring/pull/447

Aryamanz29 avatar Oct 20 '22 17:10 Aryamanz29