http2: Move all socket operations to the session thread
This is a port of a change that @mbgrydeland first made in Varnish Enterprise, for planning reasons.
It's a major change of the architecture:
- all socket operations happen in the `h2_sess` thread
  - the session's file descriptor switches to non-blocking
- coordination with `h2_req` threads is done with a file descriptor
  - based on #4348 originally submitted independently
  - this enables polling together with the socket
- control frames are sent from a fixed-length buffer living in workspace
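
To illustrate the polling arrangement described in the list above, here is a minimal sketch and not the actual implementation. It assumes a plain pipe as the coordination file descriptor (the mechanism from #4348 may well differ), and every name in it (`sess_wait`, `wakeup_post`, `wakeup_drain`, `set_nonblocking`, the `EV_*` flags) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical names throughout; none of them come from the Varnish sources. */
enum sess_event {
    EV_TIMEOUT = 0,
    EV_SOCKET  = 1 << 0,    /* the session socket has something to read */
    EV_WAKEUP  = 1 << 1     /* a request thread asked for attention */
};

/* Switch a file descriptor to non-blocking mode. */
static int
set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl < 0)
        return (-1);
    return (fcntl(fd, F_SETFL, fl | O_NONBLOCK));
}

/* Request thread side: nudge the session thread (assumed to be a pipe). */
static void
wakeup_post(int wr_fd)
{
    ssize_t r;

    do {
        r = write(wr_fd, "", 1);
    } while (r < 0 && errno == EINTR);
    /* EAGAIN means the pipe is full, so a wakeup is already pending. */
    assert(r == 1 || (r < 0 && errno == EAGAIN));
}

/* Session thread side: consume pending wakeups (rd_fd must be non-blocking). */
static void
wakeup_drain(int rd_fd)
{
    char buf[64];

    while (read(rd_fd, buf, sizeof buf) > 0)
        continue;
}

/* Session thread: wait for the socket and the wakeup fd in a single poll(). */
static int
sess_wait(int sock_fd, int wake_rd_fd, int timeout_ms)
{
    struct pollfd pfd[2] = {
        { .fd = sock_fd, .events = POLLIN },
        { .fd = wake_rd_fd, .events = POLLIN }
    };
    int r, ev = EV_TIMEOUT;

    do {
        r = poll(pfd, 2, timeout_ms);
    } while (r < 0 && errno == EINTR);
    if (r < 0)
        return (-1);
    if (pfd[0].revents & (POLLIN | POLLHUP | POLLERR))
        ev |= EV_SOCKET;
    if (pfd[1].revents & POLLIN) {
        wakeup_drain(wake_rd_fd);
        ev |= EV_WAKEUP;
    }
    return (ev);
}
```

In this sketch only the thread calling `sess_wait()` ever touches `sock_fd`; other threads merely call `wakeup_post()` after queueing work by some other means, which is the property that lets a single `poll()` cover both sources of events.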
I was the main reviewer of the change, and I personally extracted several commits (@mbgrydeland did too based on my reviews) in order to reduce the size of the main commit as much as possible:
http2: Rework HTTP/2 to be using non-blocking sockets
After porting the change to this branch, I wish I had extracted small changes into self-contained commits more aggressively, because the number of conflicts to resolve turned the port of this commit alone into several days of work.
There were a lot of explicit (line) conflicts, and a fair amount of implicit conflicts because at this point trunk and Varnish Enterprise (based on 6.0) had diverged in many small ways:
- Varnish Enterprise notably has
  - a waiting list "walk-away" feature (see #3835)
  - listen socket management (#3959)
  - task tracking with a `varnishscoreboard` utility
- and Varnish Cache has
  - a workspace emulator and hardened workspace semantics (#3644)
    - I defaulted to ws_emu+asan in my porting endeavor, it actually helped
  - a generalized VDP API (#4035)
  - a new asynchronous iterator for delivery (#4209)
Between these major changes (and probably others I'm forgetting) and lots of small core changes accumulated over the years, nearly every commit I cherry-picked ran into a merge conflict. I snuck in two original commits at the beginning, to reduce noise when conflicts happened during the port.
I caught two minor bugs in this port, and besides the workspace conflict, things went rather smoothly. It was just tedious to replay, so tedious that I frequently switched to other (important) tasks, which added delays. But I ran the test suite for each commit, and stressed each one under load with the workspace emulator paired with sanitizers.
The original change went to production and faced real-world h2 traffic, which is why the patch series ends with a bunch of bug fixes. For similar workloads, it puts a lower overall load on the system and it is more reactive. With OpenSSL, you can't perform reads and writes concurrently on the same connection, so moving all socket operations to the session thread also removes a good deal of contention for HTTPS workloads.
I'm requesting @mbgrydeland's review so he can make sure that the port is faithful to our original work. I'm also requesting @walid-git's review because he rebased his trailer implementation (Varnish Enterprise counterpart of #4125) on this architectural change (and eventually he should do the same for #4125). It was reportedly easier to implement with this new architecture.
I have one request.
Leave structural changes aside if they can wait until after merging this. If they really must happen as part of this change, they will require new commits. I didn't apply the bug fixes in the commits introducing the bugs because keeping the same list of commits will help us track what has or has not been ported in either direction (among commits relevant to both projects).
I'm otherwise open to squashing whitespace improvements or general polish into the right commits, but I will be away for a couple of weeks.