[WIP] Enable concurrent read from IPFS in replay
This is an initial implementation towards #379. Currently, multi-threading is implemented only in replay. However, the initial benchmark results are counter-intuitive: the threaded version is slower than the sequential one.
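For context, the HTTP header and the payload of a memento are stored as separate IPFS objects, so replay has to perform two lookups that could, in principle, be overlapped. Below is a rough sketch of the pattern being attempted (illustrative only; the function and parameter names are not the actual ones in this PR):

```python
# Sketch of fetching a memento's header and payload in parallel threads
# instead of back to back. Names here are hypothetical, not the PR's code.
import threading

import ipfsapi  # py-ipfs-api client

api = ipfsapi.connect('127.0.0.1', 5001)


def fetch(digest, results, key):
    # Each worker stores its result in a shared dict under its own key.
    results[key] = api.cat(digest)


def fetch_header_and_payload(header_digest, payload_digest):
    results = {}
    threads = [
        threading.Thread(target=fetch, args=(header_digest, results, 'header')),
        threading.Thread(target=fetch, args=(payload_digest, results, 'payload')),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for both objects before reconstructing the memento
    return results['header'], results['payload']
```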
Without Threading
```
$ ab -n 1000 -c 10 http://localhost:5000/memento/20160305192247/cs.odu.edu/~salam
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        nginx
Server Hostname:        localhost
Server Port:            5000

Document Path:          /memento/20160305192247/cs.odu.edu/~salam
Document Length:        1699 bytes

Concurrency Level:      10
Time taken for tests:   8.807 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      2557000 bytes
HTML transferred:       1699000 bytes
Requests per second:    113.55 [#/sec] (mean)
Time per request:       88.070 [ms] (mean)
Time per request:       8.807 [ms] (mean, across all concurrent requests)
Transfer rate:          283.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:    18   88   5.0     87     104
Waiting:       18   87   5.0     87     104
Total:         19   88   5.0     87     105

Percentage of the requests served within a certain time (ms)
  50%     87
  66%     88
  75%     89
  80%     89
  90%     90
  95%     93
  98%    100
  99%    102
 100%    105 (longest request)
```
With Threading
```
$ ab -n 1000 -c 10 http://localhost:5000/memento/20160305192247/cs.odu.edu/~salam
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:        nginx
Server Hostname:        localhost
Server Port:            5000

Document Path:          /memento/20160305192247/cs.odu.edu/~salam
Document Length:        1699 bytes

Concurrency Level:      10
Time taken for tests:   17.141 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      2557000 bytes
HTML transferred:       1699000 bytes
Requests per second:    58.34 [#/sec] (mean)
Time per request:       171.410 [ms] (mean)
Time per request:       17.141 [ms] (mean, across all concurrent requests)
Transfer rate:          145.68 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:    13  171  10.8    171     199
Waiting:       12  170  10.8    171     199
Total:         14  171  10.8    171     199

Percentage of the requests served within a certain time (ms)
  50%    171
  66%    173
  75%    174
  80%    175
  90%    176
  95%    178
  98%    185
  99%    186
 100%    199 (longest request)
```
@ibnesayeed Why were the rates with threading so much lower? I would expect them to be higher on average.
> Why were the rates with threading so much lower? I would expect them to be higher on average.
I am not sure about that yet. I did note in my initial comment that it is counter-intuitive. It could be due to some threading overhead, or our current thread implementation might not be how it should be. More fine-grained profiling is needed to find out how much time each step takes.
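For example, something as simple as a timing decorator around each stage of the replay path would give that per-step breakdown (a sketch only; no such instrumentation exists in the repo):

```python
# Wrap each replay stage (index lookup, IPFS fetch, response building) in a
# timer and log the duration, to see which one dominates under load.
import functools
import time


def timed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return fn(*args, **kwargs)
        finally:
            print('%s took %.1f ms' % (fn.__name__, (time.time() - start) * 1000))
    return wrapper
```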
Codecov Report
Merging #425 into master will increase coverage by 0.09%. The diff coverage is 13.04%.
```diff
@@            Coverage Diff             @@
##           master     #425      +/-   ##
==========================================
+ Coverage   23.29%   23.38%   +0.09%
==========================================
  Files           6        6
  Lines        1112     1116       +4
  Branches      169      167       -2
==========================================
+ Hits          259      261       +2
- Misses        836      838       +2
  Partials       17       17
```
| Impacted Files | Coverage Δ |
|---|---|
| ipwb/replay.py | 13.46% <13.04%> (+0.23%) :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
@ibnesayeed Should we wait until we can show that this is a more efficient solution before merging, or go ahead and merge it now? This PR should resolve #310, but I would like to verify that on a Windows machine before merging anyway.
This PR brings some functional changes, so we should sit on it a little longer and test it in different ways and in different environments first. Also, it is important to profile the code to identify the cause of the unexpected slowdown. If the slowdown is due to thread overhead, then we can find out how to make it more performant. If it is due to the fact that the IPFS server is running on the same machine, then it might perform better when a lookup is performed in the broader IPFS network.
I have been trying to profile this and observed some really strange behaviors. We need to isolate certain things in a separate script and test them.
Also, before merging we need to move some exception handling into the threads, because the main thread would not be aware of exceptions raised there.
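For reference, a `Thread` whose target raises will simply die silently. One common workaround (a sketch, not what the PR currently does) is to capture the exception in the worker and re-raise it on `join()`:

```python
import threading


class PropagatingThread(threading.Thread):
    """A Thread that re-raises the worker's exception when join()ed."""

    def __init__(self, target, args=()):
        threading.Thread.__init__(self)
        self.target = target
        self.args = args
        self.exc = None

    def run(self):
        try:
            self.target(*self.args)
        except Exception as e:
            # Raising here would be invisible to the main thread, so stash it.
            self.exc = e

    def join(self, timeout=None):
        threading.Thread.join(self, timeout)
        if self.exc is not None:
            raise self.exc  # now the caller actually sees the failure
```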
> observed some really strange behaviors
Please document them here.
> isolate certain things in a separate script and test them.
Which things? As we talked about in person, the Py threading mechanism ought to be compared against a non-threaded baseline, but that is likely not the culprit.
Created a separate benchmarking repository (IPFS API Concurrency Test) and reported the observations in the API repo (https://github.com/ipfs/py-ipfs-api/issues/131).
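The core of that test is essentially the following comparison, isolated from ipwb entirely (simplified; the object added below is just a throwaway string so the script is self-contained):

```python
# Time N sequential cat() calls against N threaded ones for the same object.
import threading
import time

import ipfsapi  # py-ipfs-api

api = ipfsapi.connect('127.0.0.1', 5001)
digest = api.add_str('ipwb concurrency test')  # small object to read back
N = 100

start = time.time()
for _ in range(N):
    api.cat(digest)
print('sequential: %.2fs' % (time.time() - start))

start = time.time()
threads = [threading.Thread(target=api.cat, args=(digest,)) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('threaded:   %.2fs' % (time.time() - start))

# Note: all threads share one client here; whether the client itself
# serializes requests is one of the things worth testing, so a variation
# with one client per thread should be compared as well.
```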
maybe i'm missing a point, but have you considered putting a WSGI daemon in front of ipwb?
> maybe i'm missing a point, but have you considered putting a WSGI daemon in front of ipwb?
No, currently we are serving directly from the built-in Flask server. However, I am not sure how the choice of web server has anything to do with how the content is read from the IPFS server.
> I am not sure how the choice of web server has anything to do with how the content is read from the IPFS server.
not at all.
from your benchmark i had the impression that the aim is to improve response time to web requests. a threading WSGI host process would facilitate that.
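e.g. something like this (assuming the Flask app object is exposed as `app` in `ipwb.replay`; i haven't checked the actual module layout):

```
$ pip install gunicorn
$ gunicorn --workers 4 --threads 2 --bind 127.0.0.1:5000 ipwb.replay:app
```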
if you are aiming to read the header and payload concurrently, trio is an option for Python 3.5+. (i know, but maybe that would motivate the port to Python 3.)
Thanks for the input, @funkyfuture. I was pushing for Py3 support from day one, but @machawk1 felt there are many people still using systems that only support Py2. However, now we are well-motivated to completely drop support for Py2 (as per #51). Once we are on Py3, we will explore asyncio for sure.
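Something along these lines, presumably (a sketch only; `run_in_executor` would let the existing blocking IPFS client be awaited until a native async client is available):

```python
import asyncio


async def fetch_both(api, header_digest, payload_digest):
    # Overlap the two blocking cat() calls on the default thread pool;
    # gather() returns once both the header and the payload have arrived.
    loop = asyncio.get_event_loop()
    header, payload = await asyncio.gather(
        loop.run_in_executor(None, api.cat, header_digest),
        loop.run_in_executor(None, api.cat, payload_digest),
    )
    return header, payload
```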
Note that we are trying to optimize asynchronous requests to IPFS, not to the Web. Thanks for your input, @funkyfuture; as @ibnesayeed said, we will look into using that library as applicable once we can get the rest of the code ported to Py3.