Need help explaining Crusader plots

Open richb-hanover opened this issue 1 year ago • 6 comments

@Zoxc @dtaht - I have a bunch of questions about what Crusader plots actually show, and how to explain lousy tests.

This is a first cut for a possible page for the Crusader repo. I would love to get your comments on the following - I'll keep tweaking up this article until it's "good enough". Thanks.


Comparison of Crusader Plots

A Crusader test packs a lot of data into a simple display. Here's how to understand what it's showing:

Over Ethernet

[Image: Mac mini M2 Ethernet to RPi4-plot 2024-08-18 21 28 06]

This is a plot of a pretty good test. It was performed between two devices on Ethernet - an M2 Mac mini running Crusader GUI connected to a Crusader Server running on a Raspberry Pi 4. Here is what it shows:

  • The Download and Upload traces (green and blue, respectively) show throughput of nearly 1,000 Mbps (1 Gbps), the rated values for the device interfaces.
  • The Combined plot shows transfers in both directions (green and blue) running at nearly their full rate. The purple trace is the sum of both.
  • The Download latency is a bit spiky, with most values ranging between 5 and 25 msec. (Can anyone say why?)
  • The Upload latency is quite low: well under 5 msec.
  • The Combined latency plot (purple) shows the sum of the latency in each direction (green, blue).
  • There appears to be a single packet dropped around 13.2 seconds in the bottom plot.

Over Wi-Fi from Living Room

[Image: MacBook in Living Room to RPi4-plot 2024-08-19 16 48 24]

This is a mess. That is, the Crusader test shows my Wi-Fi network is a wreck.

It's a test run from an Intel MacBook with Crusader GUI over Wi-Fi, tested against the same Raspberry Pi 4 running the server. (My router is a Belkin RT3200 with OpenWrt 22.03.5. I am currently afraid to upgrade to 23.05.4; search the forums for "OKD".) I see:

  • The Download throughput is low (65-80 Mbps) and not smooth.
  • The Upload is worse: about 20 Mbps.
  • The Combined plot shows a serious decrease in rates, likely because the packet loss and the significant bufferbloat (latency increase) interfere with the TCP connections.
  • Latency for download is higher than expected, but still less than 100 msec.
  • Latency for upload gets very high: averaging 250 msec and peaking at 500 msec.
  • The Packet Loss chart shows occasional packet loss during the Download, a large burst of packet loss during the Upload, and continuing packet loss for the Combined test.
  • What other observations are there from this plot?

Over Wi-Fi from Dining Room

[Image: Macbook in Dining Room to RPi4-plot 2024-08-19 11 26 04]

This looks a little better...

  • Download is about 100 Mbps, but still spiky.
  • Upload is a little higher, perhaps about 150 Mbps. However, there is a pattern of regular latency drops, which are currently unexplained.
  • The Combined throughput plot shows the sum (purple) about the same as the Upload throughput, with the Download (green) considerably lower than when downloading alone.
  • Latency for all plots is lower than in the previous chart, but still peaks over 150 msec numerous times.

FAQs

  • How can I run the Crusader Server? You need to use another computer connected via Ethernet.
    • The examples above use a Raspberry Pi.
    • You could also run the Crusader Server on any other Ethernet-connected computer, either building the software locally or running it in a Docker container. (This is becoming remarkably easy.)
    • You could run it on the router itself, if it's fast enough. High-end routers, especially those that support Docker containers, probably have enough CPU power both to source and route packets at the full rate. (How could I tell if it's fast enough?)
  • Why is that second test so terrible? What can I do about it?

richb-hanover avatar Aug 19 '24 14:08 richb-hanover

Over Ethernet

  • The "Both" shows transfers in both directions running at nearly their full rate. (Is it the sum of the two directions?)

It is the sum.

  • The Download latency is a bit spiky, with most values ranging between 5 and 25 msec. (Can anyone say why?)

Possibly some bufferbloat on your Pi. Is it running fq-codel?

  • The "Both" latency is ???? (not sure how to describe it - is it the sum of the up and down latency? Something else?)

There's no "Both" latency. It's just the latency measured during the "Both" load test. "Total latency" would be the sum of up and down.
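The two sums mentioned here (combined throughput, and "Total latency" as up plus down) can be written out as a short sketch. This is only an illustration, assuming the download and upload samples are taken at the same instants; the helper name `elementwise_sum` and all sample values are hypothetical and are not taken from the plots or from Crusader's code.

```python
def elementwise_sum(down, up):
    """Add per-sample download and upload values taken at the same instants."""
    if len(down) != len(up):
        raise ValueError("series must have the same number of samples")
    return [d + u for d, u in zip(down, up)]


# Hypothetical samples, NOT read from the plots in this thread.
down_mbps = [940.0, 935.0, 942.0]
up_mbps = [930.0, 938.0, 925.0]
total_mbps = elementwise_sum(down_mbps, up_mbps)  # the purple throughput trace

down_latency_ms = [12.0, 20.0, 8.0]
up_latency_ms = [2.0, 3.0, 2.5]
total_latency_ms = elementwise_sum(down_latency_ms, up_latency_ms)

print(total_mbps)
print(total_latency_ms)
```

Under these assumptions, a total that is close to the larger of the two series means the other direction contributes very little.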

Over Wi-Fi from Living Room

  • Why is the "Both" throughput pulled down to close to the Upload?

The packet loss and massive bufferbloat are probably interfering with the download TCP connections.

  • What other observations are there from this plot?

There are some regular latency spikes, which look like the station being busy scanning or unable to send data for some reason.

Over Wi-Fi from Dining Room

There are still some regular latency spikes here, not sure about those.

Other avenues for explanation

Can I run the Crusader Server on my router?

That's reasonable if you're testing another device, say an AP, and the router can provide enough bandwidth. MikroTik's RouterOS has container support and the MikroTik RB4011 seems to be able to send around 5 Gbps.

Zoxc avatar Aug 20 '24 10:08 Zoxc

@Zoxc Thanks for your comments. I have updated the info above with them. MORE COMMENTS, PLEASE :-)

You asked: Is the Pi running fq_codel?

I don't know for sure. Here's the output of the tc command:

deploy@rpi4:~$ sudo tc qdisc show dev eth0
[sudo] password for deploy:
qdisc mq 0: root
qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

I don't really understand this stuff, but it appears that the root qdisc is "mq", with five child fq_codel qdiscs. (This is the default on Ubuntu 22.04 that's running on the RPi.) Would it make a difference to remove all the child qdiscs and just install fq_codel on the root? Thanks again
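That reading of the listing can be checked mechanically. Here is a minimal Python sketch, assuming the input is the `tc qdisc show` text pasted above (each line trimmed after its first few fields). `summarize_qdiscs` is a hypothetical helper written for this illustration; it is not part of tc or Crusader.

```python
import re
from collections import Counter

# First fields of the lines pasted above; the remaining options are trimmed.
TC_OUTPUT = """\
qdisc mq 0: root
qdisc fq_codel 0: parent :5 limit 10240p flows 1024
qdisc fq_codel 0: parent :4 limit 10240p flows 1024
qdisc fq_codel 0: parent :3 limit 10240p flows 1024
qdisc fq_codel 0: parent :2 limit 10240p flows 1024
qdisc fq_codel 0: parent :1 limit 10240p flows 1024
"""


def summarize_qdiscs(text):
    """Return (root qdisc kind, Counter of child qdisc kinds)."""
    root = None
    children = Counter()
    for line in text.splitlines():
        # Expected shape: "qdisc <kind> <handle> root" or "... parent <id>".
        m = re.match(r"qdisc (\S+) \S+ (root|parent \S+)", line.strip())
        if not m:
            continue
        kind, where = m.groups()
        if where == "root":
            root = kind
        else:
            children[kind] += 1
    return root, children


root, children = summarize_qdiscs(TC_OUTPUT)
print(root, dict(children))
```

For the text above this reports an mq root with five fq_codel children, one per `parent :N` line.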

richb-hanover avatar Aug 20 '24 15:08 richb-hanover

Here's the output of the tc command:

That looks reasonable. It's possible the Pi's network drivers don't use BQL so there could be some bufferbloat in them still.

Zoxc avatar Aug 20 '24 16:08 Zoxc

Hmmm... That doesn't seem likely:

  1. It's Ubuntu 22.04 - pretty modern
  2. ll /sys/class/net/eth0/queues/tx-0/ shows:
deploy@rpi4:~$ ll /sys/class/net/eth0/queues/tx-0/
total 0
drwxr-xr-x 3 root root    0 Aug 20 14:04 ./
drwxr-xr-x 8 root root    0 Aug 20 14:04 ../
drwxr-xr-x 2 root root    0 Aug 20 14:04 byte_queue_limits/
-r--r--r-- 1 root root 4096 Aug 20 14:05 traffic_class
-rw-r--r-- 1 root root 4096 Aug 20 14:05 tx_maxrate
-r--r--r-- 1 root root 4096 Aug 20 14:05 tx_timeout
-rw-r--r-- 1 root root 4096 Aug 20 14:05 xps_cpus
-rw-r--r-- 1 root root 4096 Aug 20 14:05 xps_rxqs
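The listing shows that a byte_queue_limits directory exists for tx-0. The counters inside it may say more than the directory alone, so here is a minimal Python sketch for reading them. It assumes the usual Linux sysfs file names in that directory (such as `limit` and `inflight`); `read_bql` is a hypothetical helper. My understanding, which is worth verifying, is that the kernel can create this directory for every transmit queue when BQL support is compiled in, so its presence may not prove that the driver actually uses BQL.

```python
from pathlib import Path


def read_bql(tx_queue_dir):
    """Return {file name: integer value} for a queue's BQL counters.

    Returns None when the queue has no byte_queue_limits directory.
    """
    bql_dir = Path(tx_queue_dir) / "byte_queue_limits"
    if not bql_dir.is_dir():
        return None
    values = {}
    for entry in sorted(bql_dir.iterdir()):
        try:
            values[entry.name] = int(entry.read_text().strip())
        except (OSError, ValueError):
            # Skip entries that are unreadable or not plain integers.
            continue
    return values
```

On the Pi this could be called as `read_bql("/sys/class/net/eth0/queues/tx-0")`, ideally while a Crusader test is running. A nonzero `inflight` value under load would be a stronger hint that BQL is active than the directory listing.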

Maybe @dtaht has some thoughts...

richb-hanover avatar Aug 20 '24 18:08 richb-hanover

PS My new router arrived today (GL.iNet MT6000) and I'll run the tests against its stock firmware and current OpenWrt soon

richb-hanover avatar Aug 20 '24 18:08 richb-hanover

I added https://github.com/Zoxc/crusader/discussions/categories/wi-fi-routers-and-access-points so we can collect results for specific devices there.

Zoxc avatar Aug 21 '24 10:08 Zoxc

I am so looking forward to a comparison with the mt6000!

dtaht avatar Aug 24 '24 17:08 dtaht

@dtaht - I ran a test with the Docker container running on the MT6000. Performance was pretty bad. See https://github.com/Zoxc/crusader/discussions/43

But I now realize that I can run the pre-built binary on that device. That experiment is queued up for this weekend.

richb-hanover avatar Aug 24 '24 17:08 richb-hanover

Let's see if the 0.3 docs are good enough

richb-hanover avatar Sep 19 '24 01:09 richb-hanover