webwire-go

Performance Benchmarking

Open KernelPryanic opened this issue 7 years ago • 3 comments

Are there already any performance benchmarking results available?

KernelPryanic • Apr 02 '18 17:04

I just published a small webwire benchmarking tool.

  1. Start the test server: go run test-server.go
  2. Run the benchmark: go run benchmark.go

The following parameters are available:

  • bench-dur: benchmark duration in seconds (default: 10)
  • addr: address of the target test server like localhost:80 (default: :8081)
  • clients: number of concurrent clients (default: 10)
  • req-timeo: default request timeout (default: 10000)
  • min-req-itv: min interval between each request in milliseconds (default: 250)
  • max-req-itv: max interval between each request in milliseconds (default: 500)
  • min-pld-sz: min request payload size in bytes (default: 32)
  • max-pld-sz: max request payload size in bytes (default: 128)
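As a rough sketch of how these parameters could be declared with Go's standard `flag` package: the flag names and defaults are taken from the list above, while the `Config` struct, the `parseConfig` helper, and the min/max sanity checks are hypothetical and not taken from the actual benchmark.go.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds the benchmark parameters listed above.
// The struct and its field names are hypothetical.
type Config struct {
	BenchDur  int    // benchmark duration in seconds
	Addr      string // address of the target test server
	Clients   int    // number of concurrent clients
	ReqTimeo  int    // default request timeout
	MinReqItv int    // min interval between requests in milliseconds
	MaxReqItv int    // max interval between requests in milliseconds
	MinPldSz  int    // min request payload size in bytes
	MaxPldSz  int    // max request payload size in bytes
}

// parseConfig parses the given arguments using the documented flag names
// and defaults. The range checks at the end are an assumption.
func parseConfig(args []string) (Config, error) {
	var c Config
	fs := flag.NewFlagSet("benchmark", flag.ContinueOnError)
	fs.IntVar(&c.BenchDur, "bench-dur", 10, "benchmark duration in seconds")
	fs.StringVar(&c.Addr, "addr", ":8081", "address of the target test server")
	fs.IntVar(&c.Clients, "clients", 10, "number of concurrent clients")
	fs.IntVar(&c.ReqTimeo, "req-timeo", 10000, "default request timeout")
	fs.IntVar(&c.MinReqItv, "min-req-itv", 250, "min request interval in ms")
	fs.IntVar(&c.MaxReqItv, "max-req-itv", 500, "max request interval in ms")
	fs.IntVar(&c.MinPldSz, "min-pld-sz", 32, "min payload size in bytes")
	fs.IntVar(&c.MaxPldSz, "max-pld-sz", 128, "max payload size in bytes")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.MinReqItv > c.MaxReqItv {
		return c, fmt.Errorf("min-req-itv (%d) exceeds max-req-itv (%d)", c.MinReqItv, c.MaxReqItv)
	}
	if c.MinPldSz > c.MaxPldSz {
		return c, fmt.Errorf("min-pld-sz (%d) exceeds max-pld-sz (%d)", c.MinPldSz, c.MaxPldSz)
	}
	return c, nil
}

func main() {
	c, err := parseConfig(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", c)
}
```

Any flag that is not passed keeps its documented default, so running the sketch without arguments prints the default configuration.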

Here's an example of a 60-second benchmark with 1,000 concurrent connections, each sending requests with a 1 KiB payload at intervals of 10 to 30 milliseconds:

go run benchmark.go -clients 1000 -min-req-itv 10 -max-req-itv 30 -min-pld-sz 1024 -max-pld-sz 1024 -req-timeo 60000 -bench-dur 60

And here are the results of the above benchmark:

2018/04/02 21:20:19   Benchmark finished (60s)

  Requests performed:  1892900
  Requests timed out:  0

  Data sent:           1.81 GiB (1938329600 bytes)
  Data received:       1.81 GiB (1938329600 bytes)
  Avg payload size:    1.00 KiB

  Avg req itv:         19.955008ms
  Max req itv:         29ms
  Min req itv:         10ms

  Avg req time:        9.420078ms
  Max req time:        832.1403ms
  Min req time:        1.0004ms

  Req/s:               31548
  Bytes/s:             32305493
  Throughput:          30.81 MiB/s

System: Intel i7 3930K hexa-core @ 3.8 GHz; 64.0 GB DDR3 RAM @ 1833 MHz

As you can see, I was able to achieve around 31.5k requests per second with an average reply time of about 9.4 milliseconds at 1,000 concurrent clients.
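The rate figures in the report follow directly from the raw counters. Here is a minimal sketch of that arithmetic, assuming the rates are simply totals divided by the benchmark duration; the `Summary` type and `summarize` function are hypothetical names, not part of the benchmark tool.

```go
package main

import "fmt"

// Summary holds the derived rate metrics (hypothetical type).
type Summary struct {
	ReqPerSec   float64
	BytesPerSec float64
	MiBPerSec   float64
}

// summarize derives per-second rates from the total request count,
// the total bytes transferred in one direction, and the duration in seconds.
func summarize(requests, bytes uint64, seconds float64) Summary {
	bps := float64(bytes) / seconds
	return Summary{
		ReqPerSec:   float64(requests) / seconds,
		BytesPerSec: bps,
		MiBPerSec:   bps / (1024 * 1024), // 1 MiB = 1,048,576 bytes
	}
}

func main() {
	// Counters taken from the benchmark report above.
	s := summarize(1892900, 1938329600, 60)
	fmt.Printf("Req/s:      %.0f\n", s.ReqPerSec)
	fmt.Printf("Bytes/s:    %.0f\n", s.BytesPerSec)
	fmt.Printf("Throughput: %.2f MiB/s\n", s.MiBPerSec)
}
```

With a fixed 1 KiB payload the byte total is also just 1,892,900 requests × 1,024 bytes = 1,938,329,600 bytes, which matches the report.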

romshark • Apr 02 '18 19:04

Beware

The benchmark runs amok on Windows 10 when there are many concurrent connections.

Windows 10

TCP/IP connection establishment appears to be very slow on Windows, which causes serious problems when many concurrent connections (> 1000) are created. The large number of connections triggers an excessive number of syscalls, and because goroutines blocked in syscalls each occupy an OS thread, the Go runtime spawns thousands of OS threads. The machine becomes unresponsive once the thread count reaches about 10k.

[Screenshot: trace_benchmark_windows10]

In the above screenshot, the trace shows the excessive number of syscalls, the slowly degrading performance and the ever-growing number of spawned OS threads.
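The thread growth visible in the trace can also be sampled from inside a Go process without the trace tool. Below is a minimal sketch using the predefined `threadcreate` profile from the standard `runtime/pprof` package; the `osThreadsCreated` helper is a hypothetical name, and calling it periodically from the benchmark would be an addition that the published tool may not have.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// osThreadsCreated returns the number of OS threads the Go runtime has
// created so far, as recorded by the predefined "threadcreate" profile.
func osThreadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

func main() {
	// Logging these two numbers at intervals during a benchmark run would
	// show whether the thread count keeps climbing along with the goroutines.
	fmt.Println("goroutines:        ", runtime.NumGoroutine())
	fmt.Println("OS threads created:", osThreadsCreated())
}
```

A count that stays in the tens would match the macOS behaviour described in this thread, while a count climbing into the thousands would match the Windows 10 trace.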

macOS High Sierra

I've also tested the same configuration on macOS High Sierra and got very different results:

[Screenshot: trace_benchmark_macos_highsierra]

The Mac performed just fine with only 27 OS threads: no degrading performance, no syscall spam.

Conclusion

It looks more like a Windows-related problem than a WebWire server/client problem.

romshark • Apr 03 '18 22:04

I performed a load test using the latest revision and got the following results:

Results

  Concurrent connections:  10,000
  Request payload:         1 - 64 KiB
  Requests performed:      5,919,046
  Timeout rate:            0.00%
  Sent:                    183.44 GiB
  Received:                183.44 GiB
  Throughput:              313.07 MiB/s
  Requests per second:     9,865
  Average latency:         1 millisecond
  Maximum latency:         4.23 seconds
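The test duration and the average payload size are not stated, but both can be estimated from the figures above. This is a small sketch of that estimate, assuming the period in the request and connection counts is a thousands separator; the function names are hypothetical.

```go
package main

import "fmt"

// impliedDurationSec estimates the test duration from the total number of
// requests and the reported request rate.
func impliedDurationSec(requests, reqPerSec float64) float64 {
	return requests / reqPerSec
}

// avgPayloadKiB estimates the mean payload size from the total data sent
// (in GiB) and the number of requests. 1 GiB = 1024 * 1024 KiB.
func avgPayloadKiB(totalGiB, requests float64) float64 {
	return totalGiB * 1024 * 1024 / requests
}

func main() {
	const requests = 5919046 // requests performed
	const reqPerSec = 9865   // requests per second
	const sentGiB = 183.44   // data sent

	fmt.Printf("Implied duration: %.0f s\n", impliedDurationSec(requests, reqPerSec))
	fmt.Printf("Avg payload:      %.1f KiB\n", avgPayloadKiB(sentGiB, requests))
}
```

The estimate comes out at roughly ten minutes of runtime and an average payload of about 32.5 KiB, which is what payload sizes spread evenly between 1 and 64 KiB would give.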

Test System

Intel i7 3930K (12 threads @ 3.8 GHz, reached full load at 72°C); 64 GB DDR3 @ 1833 MHz (around 4.75 GB were used during the benchmark)

Note that both the benchmark and the server ran on this machine, which distorts the results; throughput could potentially be higher if they ran on separate machines.

romshark • Jun 29 '18 01:06