puma icon indicating copy to clipboard operation
puma copied to clipboard

Response refactor, increase requests-per-second

Open MSP-Greg opened this issue 3 years ago • 3 comments

Description

This PR refactors response generation, speeding up operations with various body types and sizes. Comparison using the the code from PR #2895:

──────────────────────────────────────────────────────────────────────────── tcp6
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6 -Y

Body    ────────── req/sec ──────────   ────── req 50% times ──────── Master
 KB     array   chunk  string      io   array   chunk  string      io
1       13662    9887   13552    7799    0.71    1.30    0.69    1.46
10       5430    2618   12787    3902    2.06    4.88    0.94    2.90
100       835     337    9945     610   13.30   32.55    1.13   17.63
256       350     135    8421     247   30.67   79.37    1.28   42.64
512       188      66    6784     123   56.31  157.21    1.60   84.97
1024      101      33    3717      63  102.29  299.51    2.19  163.60
2048       53      17    2524      32  189.07  588.73    4.30  315.08
─────────────────────────────────────────────────────────────────────

Body    ────────── req/sec ──────────   ─────── req 50% times ─────── PR #2896
 KB     array   chunk  string      io   array   chunk  string      io
1       13985   13870   14013    9697   0.727   0.732   0.726   1.160
10      13715   13483   13717    9325   0.741   0.755   0.741   1.200
100     11400   10353   11172    9388   0.940   1.040   0.970   1.130
256      9105    7640    8549    9305   1.150   1.380   1.190   1.140
512      6759    5445    6776    8997   1.520   1.920   1.520   1.100
1024     4018    3348    4050    9015   1.860   2.740   2.660   1.150
2048     2340    1783    2438    8823   4.320   6.130   4.410   1.180
─────────────────────────────────────────────────────────────────────

There are significant speed increases for almost all body types/sizes, with the exception of 'single string' bodies, which show smaller increases.

The commits:

  • [io_buffer.rb - change to using StringIO](https://github.com/puma/puma/commit/8ca473f7b0c0da4d1a3bda06be0d9500e5dbe5f1) - allows IOBuffer to be used both as a string and as an IO. Previously it was subclassed from String.

  • request.rb - refactor - Split fast_write into fast_write_response and fast_write_str. Refactor. Compute array bodies' Content-Length only when it's not provided by app. See #2892 by @guilleiguaran. Only an Array responds to #to_ary, so it is used for the body type check.

  • client.rb - small refactor

All three of the below are small refactors

Please feel free to test these changes.

The branch at https://github.com/MSP-Greg/puma/tree/00-response-refactor-bench contains the code for both this PR and PR #2895.

Closes #2892, closes #2703

Your checklist for this pull request

  • [x] I have reviewed the guidelines for contributing to this repository.
  • [ ] I have added (or updated) appropriate tests if this PR fixes a bug or adds a feature.
  • [ ] My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • [ ] If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • [x] If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • [x] I have updated the documentation accordingly.
  • [x] All new and existing tests passed, including Rubocop.

MSP-Greg avatar Jun 14 '22 16:06 MSP-Greg

Below is data for ssl connections. Again, significant speed increases in most of the body types/sizes.

──────────────────────────────────────────────────────────────────────────── ssl
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y

Body    ────────── req/sec ──────────   ─────── req 50% times ─────── Master
 KB     array   chunk  string      io   array   chunk  string      io
1       11090    7303   10931    7657    1.34    1.28    1.32    1.65
10       2267    2263    9755    3911    4.44    4.34    1.20    3.00
100       308     309    4151     673   38.56   38.93    2.96   16.38
256       125     123    2165     275   92.57   92.01    5.19   38.25
512        63      62    1191     145  179.65  178.39    9.07   75.12
1024       31      31     617      74  323.35  315.68   17.02  145.23
2048       15      15     320      38  632.93  646.78   32.60  264.60
─────────────────────────────────────────────────────────────────────

Body    ────────── req/sec ──────────   ─────── req 50% times ─────── PR #2896
 KB     array   chunk  string      io   array   chunk  string      io
1       13802   13492   13820    9576   0.839   0.850   0.838   1.220
10      13131   12708   13067    9077   0.890   0.910   0.890   1.270
100      3799    4312    4482    2772   2.590   2.400   2.320   3.810
256      2177    2029    2231    1460   4.690   4.970   4.560   7.080
512      1175    1110    1196     798   8.400   9.030   8.250  12.800
1024      611     576     637     416  16.040  17.220  15.680  24.320
2048      305     290     325     215  32.290  34.010  30.720  46.800
─────────────────────────────────────────────────────────────────────
ruby 3.2.0dev (2022-06-14T01:21:55Z master 048f14221c) +YJIT [x86_64-linux]

Using Windows WSL2/Ubuntu 22.04 with OpenSSL 3.0.2

MSP-Greg avatar Jun 14 '22 16:06 MSP-Greg

All of the above was done with Windows WSL2/Ubuntu 22.04. But, that's not native, essentially running in a VM.

So, I ran the benchmarks on GitHub Actions, which are 2 core VM's running on larger systems. They list several processors, I looked up two, one had 12 cores, another 32. I ran the test a few times to account for 'noisy neighbors'. Data below. Again, almost all response body type/size combinations run faster than current master...

wrk -t5 -c10 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -T5 -c2 -Y
Server cluster mode -w2 -t5:5, bind: tcp
ruby 3.2.0dev (2022-06-16T16:48:52Z master cd5cafa4a3) +YJIT [x86_64-linux]

═════════════════════════════════════════════════════════════════════ Master
Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       15190   13626   14527   12669   0.233   0.614   0.551   0.257
10      11129   10219   13308   10879   0.228   0.846   0.434   0.337
100      3454    3310    7736    4412   2.700   2.960   1.190   1.990
256      1595    1493    4513    2252   6.080   6.470   1.920   3.830
512       862     803    2626    1258  11.340  10.970   3.430   7.810
1024      439     419    1517     668  24.280  23.980   6.110  14.820
2048      221     213     733     345  44.230  47.190  11.850  26.920
─────────────────────────────────────────────────────────────────────

═════════════════════════════════════════════════════════════════════ PR #2896
Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       16005   15106   16250   13138   0.570   0.601   0.568   0.649
10      15634   14384   15077   12620   0.550   0.636   0.547   0.737
100      8499    7470    8817    7038   0.980   1.140   1.040   0.626
256      5215    4131    5281    6813   1.800   2.280   1.810   1.030
512      3133    2464    3155    5339   2.710   3.810   2.880   1.710
1024     1730    1336    1838    3419   5.160   6.540   5.120   2.800
2048      884     688     954    1942   9.760  11.880   9.760   4.860
─────────────────────────────────────────────────────────────────────
wrk latency data, grouped by response body size
═════════════════════════════════════════════════════════════════════    1kB
Type   req/sec    50%     75%     90%     99%    100%  Resp Size
───────────────────────────────────────────────────────────────── Master
array   15190    0.23    4.11    9.11   15.18   44.35      1022
chunk   13626    0.61    1.56    4.80   12.35   24.82      1040
string  14527    0.55    2.05    5.62   14.66   38.41      1022
io      12669    0.26    4.27    9.08   15.32   37.56      1022
───────────────────────────────────────────────────────────────── PR #2896
array   16005    0.57    0.65    0.96    3.27   40.51      1022
chunk   15106    0.60    0.72    1.00    2.71   41.34      1040
string  16250    0.57    0.64    0.87    2.37   18.10      1022
io      13138    0.65    0.88    1.59    4.47   27.82      1022

═════════════════════════════════════════════════════════════════════ 10kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 11129 0.23 4.44 9.49 15.43 75.46 10318 chunk 10219 0.85 2.00 4.71 12.26 33.53 10318 string 13308 0.43 3.33 8.82 16.75 35.13 10238 io 10879 0.34 4.39 9.84 18.99 34.19 10238 ───────────────────────────────────────────────────────────────── PR #2896 array 15634 0.55 0.71 1.48 3.84 26.45 10239 chunk 14384 0.64 0.73 1.03 2.70 42.92 10318 string 15077 0.55 0.74 1.47 3.83 19.55 10238 io 12620 0.74 0.85 1.14 2.76 28.74 10239

═════════════════════════════════════════════════════════════════════ 100kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 3454 2.70 3.27 3.88 6.41 38.69 103104 chunk 3310 2.96 3.14 3.57 5.69 32.04 103109 string 7736 1.19 1.39 1.79 4.99 42.35 102398 io 4412 1.99 2.50 3.91 10.31 31.69 102398 ───────────────────────────────────────────────────────────────── PR #2896 array 8499 0.98 1.48 2.44 7.07 21.90 102398 chunk 7470 1.14 1.53 2.49 6.84 33.93 103107 string 8817 1.04 1.19 1.56 3.99 43.69 102446 io 7038 0.63 17.34 33.80 44.19 63.60 102398

═════════════════════════════════════════════════════════════════════ 256kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 1595 6.08 6.49 7.41 12.81 49.06 263945 chunk 1493 6.47 7.03 8.10 15.22 74.17 263944 string 4513 1.92 2.43 3.95 11.27 35.72 262248 io 2252 3.83 5.49 9.87 24.42 60.74 262139 ───────────────────────────────────────────────────────────────── PR #2896 array 5215 1.80 2.01 2.46 5.33 58.87 262138 chunk 4131 2.28 2.58 3.03 5.59 25.88 263972 string 5281 1.81 2.03 2.40 5.08 21.45 262155 io 6813 1.03 4.07 24.08 42.82 57.61 262139

═════════════════════════════════════════════════════════════════════ 512kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 862 11.34 11.94 13.37 20.81 44.18 527877 chunk 803 10.97 13.85 20.43 76.32 114.16 527884 string 2626 3.43 4.19 6.00 15.99 40.05 524363 io 1258 7.81 10.53 14.66 25.06 64.31 524290 ───────────────────────────────────────────────────────────────── PR #2896 array 3133 2.71 3.63 6.26 16.88 37.30 524244 chunk 2464 3.81 4.41 5.39 9.98 27.48 528044 string 3155 2.88 3.34 4.83 13.47 39.83 524324 io 5339 1.71 2.29 12.73 41.39 69.71 524428

═════════════════════════════════════════════════════════════════════ 1024kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 439 24.28 26.15 29.12 37.36 54.84 1055750 chunk 419 23.98 25.56 27.65 38.95 61.14 1055735 string 1517 6.11 7.29 9.77 15.31 39.23 1048590 io 668 14.82 17.03 19.49 29.18 52.12 1048582 ───────────────────────────────────────────────────────────────── PR #2896 array 1730 5.16 6.89 9.78 35.59 62.04 1048535 chunk 1336 6.54 8.45 13.55 40.03 90.03 1056056 string 1838 5.12 5.82 7.62 13.64 39.44 1048488 io 3419 2.80 3.45 4.86 38.38 61.40 1048474

═════════════════════════════════════════════════════════════════════ 2048kB Type req/sec 50% 75% 90% 99% 100% Resp Size ───────────────────────────────────────────────────────────────── Master array 221 44.23 47.96 52.12 120.97 347.25 2111458 chunk 213 47.19 51.63 55.62 64.85 95.46 2111497 string 733 11.85 17.93 27.84 64.10 147.48 2096385 io 345 26.92 31.42 43.47 240.98 314.45 2097178 ───────────────────────────────────────────────────────────────── PR #2896 array 884 9.76 14.96 24.83 60.45 105.23 2097201 chunk 688 11.88 18.85 42.21 99.54 154.52 2110244 string 954 9.76 11.24 13.92 24.41 69.11 2096501 io 1942 4.86 6.20 8.41 36.39 64.73 2097527

MSP-Greg avatar Jun 17 '22 21:06 MSP-Greg

Just some comments from the phone, haven't looked through the actual changes. Not sure when I'll be able to do that.

dentarg avatar Jun 18 '22 14:06 dentarg

@MSP-Greg I think is waiting on some variable renames but otherwise looks good to go?

nateberkopec avatar Sep 01 '22 06:09 nateberkopec

@nateberkopec working on this re PR #2892.

Very busy with a (paid) project fixing Ruby code I wrote 10+ years ago. I was pretty new to Ruby then, mostly did C# and js back then...

MSP-Greg avatar Sep 04 '22 12:09 MSP-Greg

Question: Should an empty response body return a Content-Length: 0 header in the response?

MSP-Greg avatar Sep 04 '22 17:09 MSP-Greg

That's a good question, HTTP allows it and some clients might want to know it (e.g for verification), but I'm not sure how common is having it for empty body responses.

guilleiguaran avatar Sep 04 '22 19:09 guilleiguaran

@guilleiguaran I updated this, adding checks for if the body is a File io without a Content-Length.

Thanks for all your help here and in PR #2892...

Also, please see https://github.com/puma/puma/pull/2896/commits/eb5d7a35ce8b, that's the code similar to your PR. I start with app_body, and convert it to body. close is called on app_body at the end of the response handling.

From what I can tell, Content-Length: 0 is allowed as a value, and it's passed in the response. Some tests needed changes for it...

I'm not sure how common is having it for empty body responses.

Now it's being set for a body that's either [] or ['']

MSP-Greg avatar Sep 04 '22 20:09 MSP-Greg

Rebased for Rack 3, added commit test_rack_server.rb - add tests for Rack::Chunked middleware

MSP-Greg avatar Sep 07 '22 23:09 MSP-Greg

I updated the timing data shown in the first post. This PR has always used IO.copy_stream when response bodies were 'IO like'. That was updated in master with PR #2923, but this PR seems to be a few percent faster. The larger improvements when bodies are arrays or enums still exist compared to master.

I need to squash some commits and I'll push with a current rebase on master.

MSP-Greg avatar Sep 10 '22 03:09 MSP-Greg

First, sorry the failure, mixed up a rebase. I hate it when that happens.

Now that the benchmark code is in master, if anyone can test this, it would be helpful, as locally I'm using WSL2/Ubuntu.

The 'default' test run takes about 5 minutes and generates all the data in the first post. You need wrk in your path. Build/compile Puma master, and run

# IPv6
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6
# IPv4
benchmarks/local/response_time_wrk.sh -w2 -t5:5

This PR doesn't make any changes to the extension code, so you can switch to the PR branch and rerun the tests.

To run just a particular body type, the below example runs a 'file' body (i for io):

benchmarks/local/response_time_wrk.sh -w2 -t5:5 -bi

Thanks in advance.

MSP-Greg avatar Sep 11 '22 17:09 MSP-Greg

m1 macbook pro macos 12.5.1

master - ruby 3.1.0

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       18342   19389   20278   14428    0.48    0.50    0.46    0.60
10      14419    9414   18349   14362    0.65    1.24    0.49    0.62
100      4674    1828   13650   14416    2.31    6.14    0.66    0.62
256      2104     834   11236   13639    5.59   14.79    0.81    0.66
512      1108     414    6643   10097   11.29   30.53    1.34    0.84
1024      563     195    4071    5918   23.03   65.35    2.37    1.45
2048      231      82    1932    2610   44.87  124.47    5.34    3.52
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6
Server cluster mode -w2 -t5:5, bind: tcp6
Puma repo branch master
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

this pr - ruby 3.1.0

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       24542   23149   23006   15973   0.378   0.401   0.396   0.531
10      20703   19842   20825   16173   0.410   0.425   0.415   0.536
100     15954   13145   15236   14753   0.535   0.638   0.570   0.576
256     11202    8415   11325   13684   0.761   1.020   0.780   0.750
512      8226    5723    7644   11047   1.270   1.610   1.190   0.760
1024     4291    3046    4117    5309   1.920   2.870   2.600   1.590
2048     2424    1834    2368    2389   3.760   5.230   4.170   3.770
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6
Server cluster mode -w2 -t5:5, bind: tcp6
Puma repo branch 00-response-refactor
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [arm64-darwin21]

master - ruby 2.7.6

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       21411   19572   10786   15265    0.43    0.48    0.92    0.58
10      15530   10352    9982   15690    0.59    0.93    0.93    0.55
100      4959    2014   10186   14373    2.04    5.77    0.93    0.62
256      2199     806   10029   13493    5.35   15.32    0.95    0.66
512      1213     401   10125   10550   10.39   30.90    0.93    0.80
1024      597     226   10119    5716   21.59   57.79    0.94    1.51
2048      285     116   10601    3103   44.15  105.27    0.92    2.89
─────────────────────────────────────────────────────────────────────

wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6
Server cluster mode -w2 -t5:5, bind: tcp6
Puma repo branch master
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [arm64-darwin21]

this pr - ruby 2.7.6

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       25633   26126   11252   18638   0.360   0.353   0.849   0.543
10      23257   23040   11272   18224   0.366   0.375   0.850   0.474
100     17578   15364   10246   15037   0.479   0.548   0.890   0.559
256     12490    9535   10391   14082   0.668   0.880   0.870   0.720
512      8315    6743   11556    9646   0.990   1.290   0.844   0.776
1024     4721    3806   10049    5196   1.650   2.230   0.920   1.560
2048     2526    2195   10145    2428   3.290   4.070   0.950   3.640
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s tcp6
Server cluster mode -w2 -t5:5, bind: tcp6
Puma repo branch 00-response-refactor
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [arm64-darwin21]

agbaber avatar Sep 12 '22 19:09 agbaber

@agbaber

Thank you for the results, sorry they take so long. For the most part, your data tracks with mine, in that many of the speed increases are significant.

I'm wondering if something is wrong with the code, as the Ruby 2.7 string data is much different that Ruby 3.1. I've used Ruby master for most of my local testing.

Do you recall if latency summary section output before the main summary shown above listed the string body sizes as similar to the other sized bodies? Example from my output:

─────────────────────────────────────────────────────────────────────────────────── 2048kB
array      39   266.0   286.6   317.5  1510.0  1940.0   2097155
chunk      16   633.1   675.3   715.9  1370.0  1460.0   2111471
string   2627     3.9    15.1    33.2    51.2    77.8   2097409
io       7682     1.4     6.8    15.4    23.8    65.2   2097244

JFYI, the body size number isn't terribly accurate due to the limited digits in wrk's 'bytes read' output.

EDIT: I'm going to test on Ruby 2.7 locally.

MSP-Greg avatar Sep 12 '22 21:09 MSP-Greg

@agbaber Thanks again. I found the bug, Ruby 3 doesn't raise a "can't modify a frozen string" error, but Ruby 2.7 does. That was the cause of the odd string data in Ruby 2.7. Fixed in PR #2951. Rebased this, so it has the change also...

MSP-Greg avatar Sep 12 '22 23:09 MSP-Greg

Late to the party, but I tried this on Ruby 3.2.0 Preview 2. Following are the results.

My laptop information

ProductName:	macOS
ProductVersion:	12.4
BuildVersion:	21F79
ruby 3.2.0preview2 (2022-09-09 master 35cfc9a3bb) [arm64-darwin21] without YJIT

master - 0e832e71ef

$ benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       20293   17969   22868   18407    0.36    0.60    0.53    0.57
10      11181    4149   19329   16614    0.74    2.19    0.42    0.65
100      1036     236    8728    7782    8.91   69.89    1.03    1.37
256       905     272    4574    4199    7.71   21.60    1.89    2.37
512       431     149    2729    2326   15.20   41.19    3.55    4.01
1024      172      30    1428    1261   35.52  103.26    6.54    7.58
2048       47      16     759     632  155.84  646.02   12.67   15.15
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y
Server cluster mode -w2 -t5:5, bind: ssl6
Puma repo branch master
ruby 3.2.0preview2 (2022-09-09 master 35cfc9a3bb) [arm64-darwin21]
View Full Result
$ benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y

bundle exec bin/puma -q -b ssl://[::1]:40001?cert=examples/puma/cert_puma.pem&key=examples/puma/puma_keypair.pem&verify_mode=none -S tmp/bench_test_puma.state -w2 -t5:5 --control-url=tcp://127.0.0.1:40010 --control-token=test test/rackup/ci_select.ru

usage: sleep seconds
ruby: warning: Ruby was built without YJIT support
ruby: warning: Ruby was built without YJIT support
[14328] Puma starting in cluster mode...
[14328] * Puma version: 5.6.5 (ruby 3.2.0-p-1) ("Birdie's Version")
[14328] *  Min threads: 5
[14328] *  Max threads: 5
[14328] *  Environment: development
[14328] *   Master PID: 14328
[14328] *      Workers: 2
[14328] *     Restarts: (✔) hot (✔) phased
[14328] * Listening on ssl://[::1]:40001?cert=examples/puma/cert_puma.pem&key=examples/puma/puma_keypair.pem&verify_mode=none
[14328] Use Ctrl-C to stop
[14328] * Starting control server on http://127.0.0.1:40010
[14328] - Worker 1 (PID: 14333) booted in 0.0s, phase: 0
[14328] - Worker 0 (PID: 14332) booted in 0.01s, phase: 0

warm-up
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: a10'   https://[::1]:40001 |  13918 requests in  1.10s, 159.46MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: c10'   https://[::1]:40001 |   6163 requests in  1.10s,  71.10MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: s10'   https://[::1]:40001 |  22974 requests in  1.10s, 263.20MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: i10'   https://[::1]:40001 |  18741 requests in  1.10s, 214.71MB read

wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a1'    https://[::1]:40001 | 204979 requests in 10.10s, 546.71MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c1'    https://[::1]:40001 | 181543 requests in 10.10s, 487.25MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s1'    https://[::1]:40001 | 230961 requests in 10.10s, 615.87MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i1'    https://[::1]:40001 | 185924 requests in 10.10s, 495.81MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a10'   https://[::1]:40001 | 112923 requests in 10.10s,   1.26GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c10'   https://[::1]:40001 |  41905 requests in 10.10s, 483.31MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s10'   https://[::1]:40001 | 195247 requests in 10.10s,   2.18GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i10'   https://[::1]:40001 | 167796 requests in 10.10s,   1.88GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a100'  https://[::1]:40001 |  10461 requests in 10.10s,   1.02GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c100'  https://[::1]:40001 |   2371 requests in 10.03s, 237.69MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s100'  https://[::1]:40001 |  88165 requests in 10.10s,   8.55GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i100'  https://[::1]:40001 |  78638 requests in 10.10s,   7.63GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a256'  https://[::1]:40001 |   9139 requests in 10.10s,   2.25GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c256'  https://[::1]:40001 |   2743 requests in 10.10s, 696.48MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s256'  https://[::1]:40001 |  46207 requests in 10.10s,  11.36GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i256'  https://[::1]:40001 |  42425 requests in 10.10s,  10.43GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a512'  https://[::1]:40001 |   4351 requests in 10.10s,   2.13GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c512'  https://[::1]:40001 |   1509 requests in 10.10s, 764.44MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s512'  https://[::1]:40001 |  27568 requests in 10.10s,  13.51GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i512'  https://[::1]:40001 |  23491 requests in 10.10s,  11.51GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a1024' https://[::1]:40001 |   1733 requests in 10.10s,   1.70GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c1024' https://[::1]:40001 |    298 requests in 10.10s, 304.86MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s1024' https://[::1]:40001 |  14428 requests in 10.10s,  14.12GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i1024' https://[::1]:40001 |  12739 requests in 10.10s,  12.47GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a2048' https://[::1]:40001 |    473 requests in 10.04s,   0.94GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c2048' https://[::1]:40001 |    158 requests in 10.10s, 325.42MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s2048' https://[::1]:40001 |   7668 requests in 10.10s,  15.00GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i2048' https://[::1]:40001 |   6388 requests in 10.10s,  12.50GB read

Type   req/sec    50%     75%     90%     99%    100%  Resp Size  Errors
───────────────────────────────────────────────────────────────────────────────────    1kB
array   20293     0.4     0.5     0.8     3.4    11.0      1023
chunk   17969     0.6     1.1     2.8     5.4    11.6      1040
string  22868     0.5     1.0     2.4     4.3    13.2      1022
io      18407     0.6     0.9     3.2     6.0    15.2      1022
───────────────────────────────────────────────────────────────────────────────────   10kB
array   11181     0.7     1.0     1.3     4.4    15.8     10239
chunk    4149     2.2     7.1    19.5    56.0    96.4     10319
string  19329     0.4     0.6     1.2     4.3    12.1     10238
io      16614     0.7     1.1     3.2     5.9    15.6     10239
───────────────────────────────────────────────────────────────────────────────────  100kB
array    1036     8.9    23.5    72.0   214.2   313.9    102444
chunk     236    69.9   100.9   275.5   496.6   638.2    103344
string   8728     1.0     1.4     2.5     9.4    21.4    102400
io       7782     1.4     2.2     8.0    14.6    28.2    102400
───────────────────────────────────────────────────────────────────────────────────  256kB
array     905     7.7    15.5    25.4    51.5   286.1    262277
chunk     272    21.6    64.8   103.7   337.2  1050.0    264470
string   4574     1.9     2.4     2.9     7.2    28.5    262109
io       4199     2.4     3.1     9.0    20.6    32.4    262148
───────────────────────────────────────────────────────────────────────────────────  512kB
array     431    15.2    37.5    59.6   155.7   473.8    524888
chunk     149    41.2   110.0   174.3   365.3  1010.0    529420
string   2729     3.6     4.3     7.1    36.7    47.4    524522
io       2326     4.0     4.7     5.4    12.3    54.6    524349
─────────────────────────────────────────────────────────────────────────────────── 1024kB
array     172    35.5   100.7   140.7   869.0  1710.0   1051747
chunk      30   103.3   708.9   799.9   913.7  1470.0   1070940
string   1428     6.5     7.9     9.6    26.2    67.5   1049204
io       1261     7.6     8.5     9.5    18.6    73.1   1048938
─────────────────────────────────────────────────────────────────────────────────── 2048kB
array      47   155.8   425.9   523.4  1080.0  1580.0   2121956
chunk      16   646.0   806.9   888.2  1110.0  1160.0   2157877
string    759    12.7    14.6    16.6    31.6   140.6   2098010
io        632    15.2    17.0    20.4    36.0   173.2   2099701

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       20293   17969   22868   18407    0.36    0.60    0.53    0.57
10      11181    4149   19329   16614    0.74    2.19    0.42    0.65
100      1036     236    8728    7782    8.91   69.89    1.03    1.37
256       905     272    4574    4199    7.71   21.60    1.89    2.37
512       431     149    2729    2326   15.20   41.19    3.55    4.01
1024      172      30    1428    1261   35.52  103.26    6.54    7.58
2048       47      16     759     632  155.84  646.02   12.67   15.15
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y
Server cluster mode -w2 -t5:5, bind: ssl6
Puma repo branch master
ruby 3.2.0preview2 (2022-09-09 master 35cfc9a3bb) [arm64-darwin21]

[14328] - Gracefully shutting down workers...
[14328] === puma shutdown: 2022-09-13 20:58:46 +0900 ===
[14328] - Goodbye!

 5:14 Total Time

This PR - c32f8ff9b

$ benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       23522   24581   25445   21879   0.405   0.424   0.455   0.537
10      22458   21593   22811   19156   0.472   0.430   0.487   0.568
100      7772    7431    8768    6853   1.100   1.330   1.230   1.210
256      4089    3646    4731    4037   2.150   2.410   2.050   2.210
512      2303    2021    2474    2244   3.960   4.560   3.730   4.100
1024     1170    1075    1442    1262   7.930   8.840   6.510   7.570
2048      565     538     755     656  16.640  17.940  12.770  14.790
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y
Server cluster mode -w2 -t5:5, bind: ssl6
Puma repo branch 00-response-refactor
ruby 3.2.0preview2 (2022-09-09 master 35cfc9a3bb) [arm64-darwin21]
View Full Result
$ benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y

bundle exec bin/puma -q -b ssl://[::1]:40001?cert=examples/puma/cert_puma.pem&key=examples/puma/puma_keypair.pem&verify_mode=none -S tmp/bench_test_puma.state -w2 -t5:5 --control-url=tcp://127.0.0.1:40010 --control-token=test test/rackup/ci_select.ru

usage: sleep seconds
ruby: warning: Ruby was built without YJIT support
ruby: warning: Ruby was built without YJIT support
[13614] Puma starting in cluster mode...
[13614] * Puma version: 5.6.5 (ruby 3.2.0-p-1) ("Birdie's Version")
[13614] *  Min threads: 5
[13614] *  Max threads: 5
[13614] *  Environment: development
[13614] *   Master PID: 13614
[13614] *      Workers: 2
[13614] *     Restarts: (✔) hot (✔) phased
[13614] * Listening on ssl://[::1]:40001?cert=examples/puma/cert_puma.pem&key=examples/puma/puma_keypair.pem&verify_mode=none
[13614] Use Ctrl-C to stop
[13614] * Starting control server on http://127.0.0.1:40010
[13614] - Worker 0 (PID: 13628) booted in 0.01s, phase: 0
[13614] - Worker 1 (PID: 13629) booted in 0.01s, phase: 0

warm-up
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: a10'   https://[::1]:40001 |  26562 requests in  1.10s, 304.31MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: c10'   https://[::1]:40001 |  26710 requests in  1.10s, 308.04MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: s10'   https://[::1]:40001 |  26873 requests in  1.10s, 307.87MB read
wrk -t2 -c4 -d1s --latency -H 'Body-Conf: i10'   https://[::1]:40001 |  21684 requests in  1.10s, 248.42MB read

wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a1'    https://[::1]:40001 | 237588 requests in 10.10s, 633.59MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c1'    https://[::1]:40001 | 248277 requests in 10.10s, 666.33MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s1'    https://[::1]:40001 | 257024 requests in 10.10s, 685.37MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i1'    https://[::1]:40001 | 220993 requests in 10.10s, 589.28MB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a10'   https://[::1]:40001 | 226841 requests in 10.10s,   2.54GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c10'   https://[::1]:40001 | 218111 requests in 10.10s,   2.46GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s10'   https://[::1]:40001 | 230415 requests in 10.10s,   2.58GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i10'   https://[::1]:40001 | 193550 requests in 10.10s,   2.17GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a100'  https://[::1]:40001 |  78517 requests in 10.10s,   7.62GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c100'  https://[::1]:40001 |  75054 requests in 10.10s,   7.33GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s100'  https://[::1]:40001 |  88591 requests in 10.10s,   8.60GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i100'  https://[::1]:40001 |  69228 requests in 10.10s,   6.72GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a256'  https://[::1]:40001 |  41302 requests in 10.10s,  10.15GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c256'  https://[::1]:40001 |  36824 requests in 10.10s,   9.11GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s256'  https://[::1]:40001 |  47792 requests in 10.10s,  11.75GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i256'  https://[::1]:40001 |  40768 requests in 10.10s,  10.02GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a512'  https://[::1]:40001 |  23257 requests in 10.10s,  11.40GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c512'  https://[::1]:40001 |  20412 requests in 10.10s,  10.07GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s512'  https://[::1]:40001 |  24992 requests in 10.10s,  12.25GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i512'  https://[::1]:40001 |  22664 requests in 10.10s,  11.11GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a1024' https://[::1]:40001 |  11815 requests in 10.10s,  11.56GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c1024' https://[::1]:40001 |  10864 requests in 10.10s,  10.70GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s1024' https://[::1]:40001 |  14565 requests in 10.10s,  14.25GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i1024' https://[::1]:40001 |  12749 requests in 10.10s,  12.47GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: a2048' https://[::1]:40001 |   5656 requests in 10.01s,  11.06GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: c2048' https://[::1]:40001 |   5433 requests in 10.10s,  10.70GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: s2048' https://[::1]:40001 |   7628 requests in 10.10s,  14.92GB read
wrk -t8 -c16 -d10s --latency -H 'Body-Conf: i2048' https://[::1]:40001 |   6625 requests in 10.10s,  12.96GB read

Type   req/sec    50%     75%     90%     99%    100%  Resp Size
─────────────────────────────────────────────────────────────────    1kB
array   23522    0.40    0.63    1.66    3.50   10.67      1022
chunk   24581    0.42    0.66    2.27    4.34   11.19      1040
string  25445    0.46    0.76    2.06    3.75   13.53      1022
io      21879    0.54    0.95    3.15    5.15   13.91      1022
─────────────────────────────────────────────────────────────────   10kB
array   22458    0.47    0.74    2.02    3.77   11.16     10238
chunk   21593    0.43    0.65    1.69    4.07   16.12     10318
string  22811    0.49    0.79    2.20    4.11   11.73     10238
io      19156    0.57    0.96    2.70    5.07   16.60     10238
─────────────────────────────────────────────────────────────────  100kB
array    7772    1.10    1.50    2.18   11.60   28.77    102403
chunk    7431    1.33    1.89    5.60   12.58   22.79    103110
string   8768    1.23    2.04    5.79   10.36   18.90    102445
io       6853    1.21    1.58    2.02    6.84   26.84    102406
─────────────────────────────────────────────────────────────────  256kB
array    4089    2.15    2.67    3.22    6.04   32.39    262218
chunk    3646    2.41    3.00    3.61    7.66   30.20    263806
string   4731    2.05    2.65    6.64   21.66   34.47    262144
io       4037    2.21    2.69    3.22    7.24   30.38    262079
─────────────────────────────────────────────────────────────────  512kB
array    2303    3.96    4.75    5.62   11.08   57.11    524597
chunk    2021    4.56    5.54    6.55   11.35   57.56    528084
string   2474    3.73    4.57    5.40    9.28   42.29    524419
io       2244    4.10    4.88    5.85   10.87   61.56    524574
───────────────────────────────────────────────────────────────── 1024kB
array    1170    7.93    9.37   11.32   21.85   95.88   1048425
chunk    1075    8.84   10.37   12.14   22.93   95.74   1055826
string   1442    6.51    7.65    8.91   21.50   60.49   1048690
io       1262    7.57    8.53    9.59   24.34   94.99   1048879
───────────────────────────────────────────────────────────────── 2048kB
array     565   16.64   20.12   24.35   38.16  176.99   2098685
chunk     538   17.94   21.14   24.58   35.24  184.88   2113056
string    755   12.77   14.66   16.75   32.02  155.45   2098783
io        656   14.79   16.34   18.30   37.37  188.06   2098245

Body    ────────── req/sec ──────────   ─────── req 50% times ───────
 KB     array   chunk  string      io   array   chunk  string      io
1       23522   24581   25445   21879   0.405   0.424   0.455   0.537
10      22458   21593   22811   19156   0.472   0.430   0.487   0.568
100      7772    7431    8768    6853   1.100   1.330   1.230   1.210
256      4089    3646    4731    4037   2.150   2.410   2.050   2.210
512      2303    2021    2474    2244   3.960   4.560   3.730   4.100
1024     1170    1075    1442    1262   7.930   8.840   6.510   7.570
2048      565     538     755     656  16.640  17.940  12.770  14.790
─────────────────────────────────────────────────────────────────────
wrk -t8 -c16 -d10s
benchmarks/local/response_time_wrk.sh -w2 -t5:5 -s ssl6 -Y
Server cluster mode -w2 -t5:5, bind: ssl6
Puma repo branch 00-response-refactor
ruby 3.2.0preview2 (2022-09-09 master 35cfc9a3bb) [arm64-darwin21]

[13614] - Gracefully shutting down workers...
[13614] === puma shutdown: 2022-09-13 20:50:47 +0900 ===
[13614] - Goodbye!

 5:14 Total Time

JuanitoFatas avatar Sep 13 '22 12:09 JuanitoFatas

@JuanitoFatas Thanks for the data.

MSP-Greg avatar Sep 13 '22 15:09 MSP-Greg

I updated the notes in the first post, and mentioned two possible issues, correct client (socket) error handling, and possible issues if Rack::Sendfile middleware is used.

Otherwise, I think it's time to own it and merge.

MSP-Greg avatar Sep 13 '22 16:09 MSP-Greg