[WIP] Enable concurrent read from IPFS in replay
This is an initial implementation towards #379. Currently, multi-threading is implemented only in replay. However, the initial benchmark results are counter-intuitive: the threaded version is slower than the sequential one.
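For context, the HTTP header and the payload of a memento are stored as separate IPFS objects, so replay has to perform two lookups that could, in principle, be overlapped. Below is a rough sketch of the pattern being attempted (illustrative only; the function and parameter names are not the actual ones in this PR):

```python
# Sketch of fetching a memento's header and payload in parallel threads
# instead of back to back. Names here are hypothetical, not the PR's code.
import threading

import ipfsapi  # py-ipfs-api client

api = ipfsapi.connect('127.0.0.1', 5001)


def fetch(digest, results, key):
    # Each worker stores its result in a shared dict under its own key.
    results[key] = api.cat(digest)


def fetch_header_and_payload(header_digest, payload_digest):
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(header_digest, results, 'header')),
        threading.Thread(target=fetch, args=(payload_digest, results, 'payload')),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for both objects before reconstructing the memento
    return results['header'], results['payload']
```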
Without Threading
```
$ ab -n 1000 -c 10 http://localhost:5000/memento/20160305192247/cs.odu.edu/~salam
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        nginx
Server Hostname:        localhost
Server Port:            5000

Document Path:          /memento/20160305192247/cs.odu.edu/~salam
Document Length:        1699 bytes

Concurrency Level:      10
Time taken for tests:   8.807 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      2557000 bytes
HTML transferred:       1699000 bytes
Requests per second:    113.55 [#/sec] (mean)
Time per request:       88.070 [ms] (mean)
Time per request:       8.807 [ms] (mean, across all concurrent requests)
Transfer rate:          283.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:    18   88   5.0     87     104
Waiting:       18   87   5.0     87     104
Total:         19   88   5.0     87     105

Percentage of the requests served within a certain time (ms)
  50%     87
  66%     88
  75%     89
  80%     89
  90%     90
  95%     93
  98%    100
  99%    102
 100%    105 (longest request)
```
With Threading
```
$ ab -n 1000 -c 10 http://localhost:5000/memento/20160305192247/cs.odu.edu/~salam
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        nginx
Server Hostname:        localhost
Server Port:            5000

Document Path:          /memento/20160305192247/cs.odu.edu/~salam
Document Length:        1699 bytes

Concurrency Level:      10
Time taken for tests:   17.141 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      2557000 bytes
HTML transferred:       1699000 bytes
Requests per second:    58.34 [#/sec] (mean)
Time per request:       171.410 [ms] (mean)
Time per request:       17.141 [ms] (mean, across all concurrent requests)
Transfer rate:          145.68 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    13  171  10.8    171     199
Waiting:       12  170  10.8    171     199
Total:         14  171  10.8    171     199

Percentage of the requests served within a certain time (ms)
  50%    171
  66%    173
  75%    174
  80%    175
  90%    176
  95%    178
  98%    185
  99%    186
 100%    199 (longest request)
```
@ibnesayeed Why were the rates with threading so much lower? I would expect them to be higher on average.
> Why were the rates with threading so much lower? I would expect them to be higher on average.
I am not sure about that yet. I did note in my initial comment that it is counter-intuitive. It could be due to some threading overhead, or our current thread implementation might not be how it should be. More fine-grained profiling is needed to find out how much time each step takes.
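For example, something as simple as a timing decorator around each stage of the replay path would give that per-step breakdown (a sketch only; no such instrumentation exists in the repo):

```python
# Wrap each replay stage (index lookup, IPFS fetch, response building) in a
# timer and log the duration, to see which one dominates under load.
import functools
import time


def timed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return fn(*args, **kwargs)
        finally:
            print('%s took %.1f ms' % (fn.__name__, (time.time() - start) * 1000))
    return wrapper
```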
Codecov Report
Merging #425 into master will increase coverage by 0.09%. The diff coverage is 13.04%.
```diff
@@            Coverage Diff             @@
##           master     #425      +/-   ##
==========================================
+ Coverage   23.29%   23.38%   +0.09%
==========================================
  Files           6        6
  Lines        1112     1116       +4
  Branches      169      167       -2
==========================================
+ Hits          259      261       +2
- Misses        836      838       +2
  Partials       17       17
```
| Impacted Files | Coverage Δ |
|---|---|
| ipwb/replay.py | 13.46% <13.04%> (+0.23%) :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@ibnesayeed Should we wait until we can show that this is a more efficient solution before merging, or go ahead and merge it now? This PR should resolve #310, but I would like to verify that on a Windows machine before merging anyway.
This PR brings some functional changes, so we should sit on it a little longer and test it in different ways and in different environments first. Also, it is important to profile the code to identify the cause of the unexpected slowdown. If the slowdown is due to thread overhead, then we can find out how to make it more performant. If it is due to the fact that the IPFS server is running on the same machine, then it might perform better when a lookup is performed in the broader IPFS network.
I have been trying to profile this and observed some really strange behaviors. We need to isolate certain things in a separate script and test them.
Also, before merging we need to move some exception handling into the threads, because the main thread would not be aware of exceptions raised there.
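For reference, a `Thread` whose target raises will simply die silently. One common workaround (a sketch, not what the PR currently does) is to capture the exception in the worker and re-raise it on `join()`:

```python
import threading


class PropagatingThread(threading.Thread):
    """A Thread that re-raises the worker's exception when join()ed."""

    def __init__(self, target, args=()):
        threading.Thread.__init__(self)
        self.target = target
        self.args = args
        self.exc = None

    def run(self):
        try:
            self.target(*self.args)
        except Exception as e:
            # Raising here would be invisible to the main thread, so stash it.
            self.exc = e

    def join(self, timeout=None):
        threading.Thread.join(self, timeout)
        if self.exc is not None:
            raise self.exc  # now the caller actually sees the failure
```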
> observed some really strange behaviors
Please document them here.
> isolate certain things in a separate script and test them.
Which things? As we talked about in person, the Py threading mechanism ought to be compared against a non-threaded baseline, but that is likely not the culprit.
Created a separate benchmarking repository (IPFS API Concurrency Test) and reported the observations in the API repo (https://github.com/ipfs/py-ipfs-api/issues/131).
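The core of that test is essentially the following comparison, isolated from ipwb entirely (simplified; the object added below is just a throwaway string so the script is self-contained):

```python
# Time N sequential cat() calls against N threaded ones for the same object.
import threading
import time

import ipfsapi  # py-ipfs-api

api = ipfsapi.connect('127.0.0.1', 5001)
digest = api.add_str('ipwb concurrency test')  # small object to read back
N = 100

start = time.time()
for _ in range(N):
    api.cat(digest)
print('sequential: %.2fs' % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=api.cat, args=(digest,)) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('threaded:   %.2fs' % (time.time() - start))

# Note: all threads share one client here; whether the client itself
# serializes requests is one of the things worth testing, so a variation
# with one client per thread should be compared as well.
```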
maybe i'm missing a point, but have you considered putting a WSGI daemon in front of ipwb?
> maybe i'm missing a point, but have you considered putting a WSGI daemon in front of ipwb?
No, currently we are serving directly from the built-in Flask server. However, I am not sure how the choice of web server has anything to do with how the content is read from the IPFS server.
> I am not sure how the choice of web server has anything to do with how the content is read from the IPFS server.
not at all.
from your benchmark i had the impression that the aim is to improve response time to web requests. a threading WSGI host process would facilitate that.
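e.g. something like this (assuming the Flask app object is exposed as `app` in `ipwb.replay`; i haven't checked the actual module layout):

```
$ pip install gunicorn
$ gunicorn --workers 4 --threads 2 --bind 127.0.0.1:5000 ipwb.replay:app
```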
if you are aiming to read the header and payload concurrently, trio is an option for Python 3.5+. (i know, but maybe that would motivate the port to Python 3.)
Thanks for the input, @funkyfuture. I was pushing for Py3 support from day one, but @machawk1 felt there are many people still using systems that only support Py2. However, now we are well-motivated to completely drop support for Py2 (as per #51). Once we are on Py3, we will explore asyncio for sure.
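Something along these lines, presumably (a sketch only; `run_in_executor` would let the existing blocking IPFS client be awaited until a native async client is available):

```python
import asyncio


async def fetch_both(api, header_digest, payload_digest):
    # Overlap the two blocking cat() calls on the default thread pool;
    # gather() returns once both the header and the payload have arrived.
    loop = asyncio.get_event_loop()
    header, payload = await asyncio.gather(
        loop.run_in_executor(None, api.cat, header_digest),
        loop.run_in_executor(None, api.cat, payload_digest),
    )
    return header, payload
```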
Note that we are trying to optimize asynchronous requests to IPFS, not to the Web. Thanks for your input, @funkyfuture; as @ibnesayeed said, we will look into using that library as applicable once we can get the rest of the code ported to Py3.