http4s-armeria

Armeria server performance analysis

Open ikhoon opened this issue 5 years ago • 9 comments

A benchmark and performance analysis would be good data for advocating the Armeria server to http4s users.

ikhoon avatar Aug 05 '20 15:08 ikhoon

@hamnis set up a server-example project, which we've used to run some crude benchmarks between the various backends. It doesn't show off much yet, but it gives us a baseline for pings.

rossabaker avatar Aug 07 '20 03:08 rossabaker

1st Benchmark:

  • Environment: MacBook Pro, 2.2 GHz 6-Core Intel Core i7
  • Benchmark code: https://github.com/hamnis/http4s-server-example/commit/0cfeee3e4c4fd965f23f04f40cdf8a70981285af
  • Result
    • http4s-armeria
      $ wrk -t12 -c400 -d30s http://127.0.0.1:8080/hello
      
      Running 30s test @ http://127.0.0.1:8080/hello
        12 threads and 400 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency    11.76ms    2.97ms  86.66ms   89.53%
          Req/Sec     1.71k     0.94k    3.64k    52.50%
        611170 requests in 30.02s, 107.62MB read
        Socket errors: connect 155, read 170, write 0, timeout 0
      Requests/sec:  20358.15
      Transfer/sec:      3.58MB
      
    • Blaze
      $ wrk -t12 -c400 -d30s http://127.0.0.1:8080/hello
      Running 30s test @ http://127.0.0.1:8080/hello
        12 threads and 400 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency     2.94ms    2.19ms 107.06ms   96.86%
          Req/Sec     6.99k     3.06k   14.33k    53.22%
        2506706 requests in 30.03s, 351.81MB read
        Socket errors: connect 155, read 277, write 0, timeout 0
      Requests/sec:  83481.35
      Transfer/sec:     11.72MB
      
    • Plain Armeria
      $ wrk -t12 -c400 -d30s http://127.0.0.1:8080/hello
      
      Running 30s test @ http://127.0.0.1:8080/hello
        12 threads and 400 connections
        Thread Stats   Avg      Stdev     Max   +/- Stdev
          Latency     3.16ms  450.83us  28.96ms   87.82%
          Req/Sec     6.37k     3.29k   13.52k    47.94%
        2280593 requests in 30.01s, 401.54MB read
        Socket errors: connect 155, read 142, write 0, timeout 0
      Requests/sec:  75992.98
      Transfer/sec:     13.38MB
      
  • Bottleneck
    • http4s-armeria consumes all CPU cycles: utilization hits 100%
    • http4s-blaze hits 50~60%
  • Analysis
    • The throughput (Requests/sec) of http4s-armeria increased 3~4 times after removing the toUnicastPublisher operation, which converts Stream[F, HttpObject] to Publisher[HttpObject]
  • Conclusion
    • We cannot release the initial version of http4s-armeria until the high CPU utilization problem is solved.
    • We might need an optimized converter for Reactive Streams?
    • ???
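
To put the three Requests/sec figures above side by side, here is a minimal sketch that divides them. The class and constant names are hypothetical and exist only for this illustration. The numbers are copied from the wrk output in this comment.

```java
import java.util.Locale;

public class ThroughputRatios {
    // Requests/sec values copied from the wrk runs reported above.
    static final double HTTP4S_ARMERIA = 20358.15;
    static final double BLAZE = 83481.35;
    static final double PLAIN_ARMERIA = 75992.98;

    // How many times more requests per second `a` served compared with `b`.
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.printf(Locale.ROOT, "Blaze / http4s-armeria:         %.2fx%n",
                ratio(BLAZE, HTTP4S_ARMERIA));
        System.out.printf(Locale.ROOT, "Plain Armeria / http4s-armeria: %.2fx%n",
                ratio(PLAIN_ARMERIA, HTTP4S_ARMERIA));
    }
}
```

On these numbers, Blaze serves roughly 4.1 times and plain Armeria roughly 3.7 times as many requests per second as http4s-armeria.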

ikhoon avatar Aug 30 '20 13:08 ikhoon

Is it resolved? Congrats on the new release!

ngbinh avatar Feb 02 '21 02:02 ngbinh

> Is it resolved?

I think there is room to optimize http4s-armeria performance. 💪

The benchmark load was a simple "hello world" text message over HTTP/1.1. Because the load was so simple, the bottleneck that consumed most of the CPU was the conversion from Reactive Streams to fs2 Stream and vice versa. fs2-reactive-streams is used for the conversion.

So far I have only removed the conversion from fs2 streams to Reactive Streams for HttpResponse. The remaining conversion is from HttpRequest (Reactive Streams) to fs2 streams, which still uses fs2-reactive-streams. I am going to remove the fs2-reactive-streams dependency and implement an optimized version for fs2.
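
To illustrate why a tiny payload makes stream conversion stand out, here is a minimal sketch of the Reactive Streams handshake for a one-chunk body. It uses the JDK's `java.util.concurrent.Flow` interfaces. It does not reproduce fs2-reactive-streams or Armeria internals: `SingleChunkSignals` and `SingleChunkPublisher` are hypothetical names, and the code only records the signals that any bridge would have to exchange for a single element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

public class SingleChunkSignals {

    // Hypothetical stand-in for a tiny response body: emits one element, then completes.
    static final class SingleChunkPublisher implements Flow.Publisher<String> {
        private final String chunk;
        private final List<String> signals;

        SingleChunkPublisher(String chunk, List<String> signals) {
            this.chunk = chunk;
            this.signals = signals;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super String> subscriber) {
            signals.add("subscribe");
            subscriber.onSubscribe(new Flow.Subscription() {
                private boolean done;

                @Override
                public void request(long n) {
                    signals.add("request(" + n + ")");
                    if (done) {
                        return; // already completed or cancelled
                    }
                    done = true;
                    if (n <= 0) {
                        subscriber.onError(new IllegalArgumentException("non-positive request"));
                        return;
                    }
                    subscriber.onNext(chunk);
                    subscriber.onComplete();
                }

                @Override
                public void cancel() {
                    signals.add("cancel");
                    done = true;
                }
            });
        }
    }

    // Subscribes once and returns every signal exchanged, in order.
    static List<String> run(String chunk) {
        List<String> signals = new ArrayList<>();
        new SingleChunkPublisher(chunk, signals).subscribe(new Flow.Subscriber<String>() {
            @Override public void onSubscribe(Flow.Subscription s) {
                signals.add("onSubscribe");
                s.request(1); // demand exactly one element
            }
            @Override public void onNext(String item) { signals.add("onNext(" + item + ")"); }
            @Override public void onError(Throwable t) { signals.add("onError"); }
            @Override public void onComplete() { signals.add("onComplete"); }
        });
        return signals;
    }

    public static void main(String[] args) {
        run("hello world").forEach(System.out::println);
    }
}
```

Even for one "hello world" chunk, the exchange is subscribe, onSubscribe, request(1), onNext and onComplete. With a payload this small, that fixed per-message signalling (plus whatever the conversion layer adds on top) is not amortized over any real body processing.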

ikhoon avatar Feb 02 '21 12:02 ikhoon

Thanks for the explanation. Looking forward to it!

ngbinh avatar Feb 02 '21 14:02 ngbinh

> > Is it resolved?
>
> I think there is room to optimize http4s-armeria performance. 💪
>
> The benchmark load was a simple "hello world" text message over HTTP/1.1. Because the load was so simple, the bottleneck that consumed most of the CPU was the conversion from Reactive Streams to fs2 Stream and vice versa. fs2-reactive-streams is used for the conversion.
>
> So far I have only removed the conversion from fs2 streams to Reactive Streams for HttpResponse. The remaining conversion is from HttpRequest (Reactive Streams) to fs2 streams, which still uses fs2-reactive-streams. I am going to remove the fs2-reactive-streams dependency and implement an optimized version for fs2.

@ikhoon was the optimization to fs2 done?

andyczerwonka avatar Sep 15 '22 22:09 andyczerwonka

It was partially done and is still in progress. But there were many performance optimizations on the Armeria side. I strongly believe that http4s-armeria can be used in production.

ikhoon avatar Sep 16 '22 00:09 ikhoon

@ikhoon It's been two years since you ran the benchmark. It'd be nice if you could rerun it with the new http4s-armeria/blaze. I believe some of the numbers in the benchmark results must have changed.

danicheg avatar Sep 16 '22 08:09 danicheg

Agreed. The benchmark results may well have changed. I will run the benchmarks soon.

ikhoon avatar Sep 19 '22 01:09 ikhoon