serenity
serenity copied to clipboard
AK+LibCore+LibHTTP+LibTest: Introduce more bugs in asynchronous code
You thought you mastered idiosyncrasies of AK::Stream? This PR gets you covered by adding a similar but completely unrelated way of doing datastreams.
You thought you figured out how to write event-driven asynchronous code in Serenity? This PR gets you covered by adding a similar but subtly different way for dealing with asynchronousness.
You thought you knew how and when lifetimes are ended? This PR gets you covered by complicating things so much you would start to question your C++ skills.
You thought you knew how to read backtraces in GDB? This PR gets you covered by converting backtraces into an unreadable mess ~~going both in call-stack and reverse call-stack directions while~~ interleaving actual calls with visual clutter.
You thought you knew what I would upstream first? This PR gets you covered by choosing a different subset of features to contribute.
As much as I would like to leave only the first 5 sentences in the PR description, I guess I have to properly explain some things here. So, I think coroutines are ready to be experimented on in tree. Thus, this PR implements AK::Coroutine
, defines and gives an extremely detailed description of AK::Async{Input,Output,}Stream
s interfaces, and implements a very bare-bones fully asynchronous HTTP/1.1 client.
In order to prove that the architecture I chose for asynchronous streams is adequate, I've implemented a variety of streams, stream adapters, and stream wrappers, including but not limiting to AK::AsyncLexingAdapter
for consuming data until a predefined delimiter, AK::AsyncInputStreamSlice
for treating a prefix of the stream with some predefined length as a stream by itself, Test::AsyncMemory{Input,Output}Stream
for treating memory locations as streams, HTTP::(anonymous namespace)::ChunkedBodyStream
for treating a Transfer-Encoding: chunked
-encoded HTTP response body as a stream, and (not present in this PR) AK::AsyncInputStreamLegacyTransform
for treating an asynchronous stream transformed by legacy AK::Stream
as asynchronous stream. All of these classes have a very simple and (after you spend dozens of hours using AK::Coroutine
) intuitive implementations. Therefore, I don't expect the fundamental design of streams to change much anymore.
I know, however, that there are a few minor deficiencies in the current asynchronous framework. ~~Namely, asynchronous input streams are basically asynchronous generators and it would be very inconvenient and error-prone to write more complicated transformations as hand-unrolled state machines (you can see what I mean by looking at ChunkedBodyStream and comment just above its enqueue_some
). Additionally, current lockless design of AsyncOutputStream::write
won't work for HTTP/2. However, these two problems can be dealt with later without significant changes to the code I add in this PR.~~ Generated streams have been implemented. However, I now think that AsyncOutputStream design is just plain wrong, but again, it can be improved later.
Note that this PR doesn't contain asynchronous TCP socket as it requires a bit more work in general and Serenity lacks necessary runtime functions to support it. This doesn't mean that the code here can't be tested -- Test::Async{Input,Output}Stream combined using AK::AsyncStreamPair
work well as a replacement for socket in tests (and actually allow proper unit testing of HTTP code).
What are some tangible benefits of this approach? You've spent several paragraphs joking about how this makes the code harder to understand and debug, but haven't said a word about what it makes better.
What are the throughput and latency improvements from these changes? Surely there is some measurable performance gain to justify the increase in complexity :)
@awesomekling, actually I argue that it is easier to write correct asynchronous code with coroutines than without. For example, here is a streamable asynchronous implementation of deflate decompression. In total, it has 17 states across 3 different state machines, all of which were generated automatically. This would have been a total nightmare to write manually.
Why do we need asynchronous algorithms in the first place? Well, to not have threading monstrosity like CxBoog is introducing in #24313. We don't want to start a new thread for every compressed response needed to be streamed, do we?
Surely there is some measurable performance gain to justify the increase in complexity :)
In general, a synchronous implementation can always be made faster than an asynchronous one just because the former doesn't need to track its state. Coroutines selling point is not performance, it is interleaving. Despite that, since I designed an asynchronous datastream framework for the ground up in this PR, I made sure it can be made more performant than old streams (I'm also planning to gradually transition old streams to the new design later since I think it is much better). And, in fact, the inflate implementation I linked is 10% faster than our current synchronous implementation (despite them sharing a lot of code and the algorithm and the fact that I spend a grand total of 15 minutes optimizing it).
Well, at least this is interesting 🧐
Changes in the last push:
- Rebase on top of master
- Fix compile error in
Coroutine<void>::Coroutine(Coroutine<void>&&)
noticed by Ali (and add a test for it) - Add a test for 0-length read of
AsyncInputStream
after EOF has been reached
Oops, not sure how this has happened
HTTP client possibly pending removal if left unused for a while, let's dew it.