feat(server): spawn task sooner in listenloop
- ~~Uses `Threads.@spawn` instead of `@async` in listenloop~~
- moves the SSL handshake into the spawned task, so that the next accept happens faster
- adds a 5 second timeout to the SSL handshake
- makes SSL lockable, because I don't trust MbedTLS to work in a potentially multi-threaded environment
- makes the `conns` dict lockable, because that's also not thread safe in itself, if I understand correctly
(this work has a lot of ideas and snippets from @tanmaykm 🙏🏽)
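In sketch form, the change moves the slow per-connection setup off the accept path. Here is a runnable toy version, with a plain echo standing in for the TLS handshake and handler; all names are illustrative and not HTTP.jl's actual internals:

```julia
using Sockets

# Toy version of the listenloop change: the loop only accepts; everything
# slow (in the PR: the MbedTLS handshake, bounded by a 5 s timeout) runs in
# the spawned task, so the next accept() isn't blocked behind it.
function start_echo_server()
    port, server = listenany(ip"127.0.0.1", 8000)
    @async while isopen(server)
        sock = try
            accept(server)
        catch
            break  # listener was closed
        end
        Threads.@spawn begin
            # per-connection work happens here, off the accept loop
            println(sock, readline(sock))
            close(sock)
        end
    end
    return port, server
end

port, server = start_echo_server()
client = connect(ip"127.0.0.1", port)
println(client, "hello")
reply = readline(client)
close(server)
```

The point of the sketch is only the shape: `accept` returns as soon as the TCP connection is established, and anything expensive after that runs in the spawned task.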
Codecov Report
Attention: 5 lines in your changes are missing coverage. Please review.
Comparison is base (`15c6451`) 82.71% compared to head (`e608d08`) 82.57%.
| Files | Patch % | Lines |
|---|---|---|
| src/Servers.jl | 81.48% | 5 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #1102 +/- ##
==========================================
- Coverage 82.71% 82.57% -0.14%
==========================================
Files 32 32
Lines 3054 3065 +11
==========================================
+ Hits 2526 2531 +5
- Misses 528 534 +6
:umbrella: View full report in Codecov by Sentry.
> Uses `Threads.@spawn` instead of `@async` in listenloop
I think using @async rather than @spawn was a conscious decision, but i can't remember why now -- so hopefully @quinnj remembers 😊
Looking at old PRs, it may be either that `Threads.@spawn` is more expensive or that the connections dict is not thread safe (I now lock it). And, of course, this is definitely a breaking change now that I think of it, because the user's `f` in `handle_connection` is definitely not 100% thread safe, so that's probably best suited for another PR.
Yes, i think that's it. And i think we rely on that in our internal Server code (at RAI)... i see this comment in some internal server code
# HTTP.serve! uses `@async` under the hood. Because of `@async` and because there are no
# yield points during server state manipulation, we can modify the server state within
# `handle_*` methods without having to worry about thread safety, like we would if
# HTTP.serve! used `@spawn` to handle requests. Actual work is done in the `@spawn`ed
# tasks which **don't** access the `Server`.
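The assumption in that comment is exactly what breaks under `Threads.@spawn`: spawned tasks can run in parallel, so shared state like the connections dict needs a lock. A minimal sketch of the locking pattern (illustrative names, not HTTP.jl's internals):

```julia
# With `@async`, all tasks share one thread and only switch at yield points,
# so plain Dict updates between yields are safe. With `Threads.@spawn` the
# tasks may run in parallel, so the shared dict must be guarded by a lock.
const conns = Dict{Int,Symbol}()
const conns_lock = ReentrantLock()

function register!(id)
    lock(conns_lock) do
        conns[id] = :open
    end
end

@sync for id in 1:100
    Threads.@spawn register!(id)
end

length(conns)  # all 100 registrations survive, no lost updates
```

Without the lock, concurrent `Dict` mutation from multiple threads is undefined behaviour and can corrupt the table, which is presumably why the PR makes both `conns` and the SSL context lockable.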
Makes sense. I think the spirit of this PR doesn't need Threads.@spawn. I'll have another one in the making with another idea that may fit better. Thanks for taking a look 🤗
I'm not sure how to fix the invalidations :(
~~Also, it seems that not locking MbedTLS breaks down everything, with the listening socket ending up in a FIN_WAIT2. Locking, on the other hand, seems to introduce a significant slowdown. I'll convert this to a draft for now, until I have more time to work on it.~~ nvm, just had a bug.
Sorry for the confusion, I just missed an `UndefVarError` for a second. @quinnj I'm not sure how to fix the invalidations issue, but the PR should be ready for review other than that. Unfortunately, I can't get a performance improvement out of it, but it's not slower either (both this PR and #master are around 35 requests/second).
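For reference, the 5 second handshake timeout from the description can be expressed with a pattern like the following; `with_timeout` is a hypothetical helper for illustration, not HTTP.jl's actual API:

```julia
# Run `f` in its own task and give up after `secs` seconds. HTTP.jl's real
# handshake timeout may be implemented differently; this only illustrates
# the shape of the idea using Base.timedwait.
function with_timeout(f, secs)
    t = Threads.@spawn f()
    if timedwait(() -> istaskdone(t), secs) === :ok
        return fetch(t)
    end
    error("timed out after $(secs)s")
end

with_timeout(() -> 42, 5.0)
```

Bounding the handshake this way matters precisely because it now runs inside the spawned task: a client that connects and then stalls mid-handshake would otherwise pin that task forever.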
Benchmarked it on a simple hello world endpoint with ab
before

```
Requests per second:    54.70 [#/sec] (mean)
Time per request:       1827.989 [ms] (mean)
Time per request:       18.280 [ms] (mean, across all concurrent requests)
Transfer rate:          10.47 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    8  14.0      1      51
Processing:    78 1329 292.0   1432    1856
Waiting:       78 1328 292.0   1432    1847
Total:         98 1336 283.7   1435    1858
```

after

```
Requests per second:    69.11 [#/sec] (mean)
Time per request:       1447.051 [ms] (mean)
Time per request:       14.471 [ms] (mean, across all concurrent requests)
Transfer rate:          13.23 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    8  14.6      1      53
Processing:    73 1293 315.3   1419    1466
Waiting:       63 1293 315.4   1419    1466
Total:         75 1302 306.4   1421    1513
```
> we rely on that in our internal Server code (at RAI)...
To clarify, we're still in favour of changing `@async` -> `@spawn`.
At RAI we currently have some code that is written based on the assumption that HTTP.jl is using `@async` here. We'd love to see HTTP.jl move away from `@async` and use `@spawn` instead; we'd just want that to be done in a breaking release (and of course we'd need to rewrite some internal code before upgrading to that new HTTP.jl version).