
RequestServer: Starting a request can take a long time

Open · awesomekling opened this issue on Feb 22, 2024

When loading websites in Ladybird, the WebContent process often ends up waiting quite a long time for RequestServer to respond to the start_request IPC call.

Here's an example when loading https://twinings.co.uk/

712533.543 WebContent(1903606): sync response 1270977860::4 took 193ms
712534.107 WebContent(1903606): sync response 1270977860::4 took 29ms
712534.247 WebContent(1903606): sync response 1270977860::4 took 141ms
712534.575 WebContent(1903606): sync response 1270977860::4 took 157ms
712534.911 WebContent(1903606): sync response 1270977860::4 took 193ms
712535.151 WebContent(1903606): sync response 1270977860::4 took 121ms
712537.371 WebContent(1903606): sync response 1270977860::4 took 5ms
712537.923 WebContent(1903606): sync response 1270977860::4 took 269ms
712538.047 WebContent(1903606): sync response 1270977860::4 took 125ms
712538.371 WebContent(1903606): sync response 1270977860::4 took 145ms
712539.855 WebContent(1903606): sync response 1270977860::4 took 109ms
712540.103 WebContent(1903606): sync response 1270977860::4 took 77ms
712542.039 WebContent(1903606): sync response 1270977860::4 took 269ms
712542.359 WebContent(1903606): sync response 1270977860::4 took 321ms
712542.871 WebContent(1903606): sync response 1270977860::4 took 429ms
712544.419 WebContent(1903606): sync response 1270977860::4 took 289ms
712545.875 WebContent(1903606): sync response 1270977860::4 took 101ms

As you can see, we spent 2973ms(!) just blocking on this IPC call when we could have been running JS, doing layout, etc!

I think a good fix here could be to make the start_request IPC asynchronous. This would allow WebContent to continue executing immediately after sending it.

It would also be good to improve request startup time in RequestServer, but that's a separate problem.
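
For readers less familiar with the pattern, here's a standalone C++ sketch of the difference (this is not Ladybird's actual LibIPC code; names and timings are invented for illustration): a synchronous round trip blocks the caller for the whole request startup, while an asynchronous send lets the caller keep doing useful work and pick up the result later.

#include <chrono>
#include <cstdio>
#include <future>
#include <thread>

using namespace std::chrono_literals;

// Stand-in for RequestServer doing slow request startup (DNS, TCP, TLS, ...).
static int slow_start_request()
{
    std::this_thread::sleep_for(200ms); // pretend startup takes ~200ms
    return 42;                          // pretend this is the request id
}

// Synchronous style: the caller blocks for the whole round trip.
static void sync_style()
{
    int request_id = slow_start_request(); // nothing else can run here
    std::printf("sync: got request id %d\n", request_id);
}

// Asynchronous style: kick off the request and keep working; collect the reply later.
static void async_style()
{
    auto pending = std::async(std::launch::async, slow_start_request);
    std::printf("async: free to run JS, do layout, etc. in the meantime\n");
    std::printf("async: got request id %d\n", pending.get());
}

int main()
{
    sync_style();
    async_style();
}

In Ladybird's case, the work the caller could be doing in the meantime is exactly the JS execution and layout mentioned above.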

For reproduction, here's the patch that produced the log above:

diff --git a/Userland/Libraries/LibIPC/Connection.cpp b/Userland/Libraries/LibIPC/Connection.cpp
index e98f503c4f..0adbf6ecca 100644
--- a/Userland/Libraries/LibIPC/Connection.cpp
+++ b/Userland/Libraries/LibIPC/Connection.cpp
@@ -197,6 +198,10 @@ ErrorOr<void> ConnectionBase::drain_messages_from_peer()
 
 OwnPtr<IPC::Message> ConnectionBase::wait_for_specific_endpoint_message_impl(u32 endpoint_magic, int message_id)
 {
+    auto timer = Core::ElapsedTimer::start_new();
+    ScopeGuard guard = [&] {
+        dbgln("sync response {}::{} took {}ms", endpoint_magic, message_id, timer.elapsed());
+    };
     for (;;) {
         // Double check we don't already have the event waiting for us.
         // Otherwise we might end up blocked for a while for no reason.
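
The patch just wraps the blocking wait in a Core::ElapsedTimer and logs the elapsed time from a ScopeGuard when the function returns. The same RAII timing trick can be sketched with nothing but the standard library (names here are made up for illustration; this is not the AK/LibCore code):

#include <chrono>
#include <cstdio>
#include <thread>

using namespace std::chrono_literals;

// Logs how long the enclosing scope took once it is destroyed, mirroring the
// ElapsedTimer + ScopeGuard pair in the patch above.
struct ScopeTimer {
    explicit ScopeTimer(char const* label)
        : m_label(label)
    {
    }
    ~ScopeTimer()
    {
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - m_start);
        std::printf("%s took %lldms\n", m_label, static_cast<long long>(elapsed.count()));
    }

    char const* m_label;
    std::chrono::steady_clock::time_point m_start = std::chrono::steady_clock::now();
};

static void wait_for_response() // stand-in for the blocking IPC wait loop
{
    ScopeTimer timer("sync response");
    std::this_thread::sleep_for(150ms); // pretend we blocked on the peer
}

int main()
{
    wait_for_response();
}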

awesomekling · Feb 22 '24

No need to mess with the IPC endpoints; this issue is easily fixable:

diff --git a/Userland/Services/RequestServer/GeminiProtocol.cpp b/Userland/Services/RequestServer/GeminiProtocol.cpp
index 60c36b0579..a675aa9f79 100644
--- a/Userland/Services/RequestServer/GeminiProtocol.cpp
+++ b/Userland/Services/RequestServer/GeminiProtocol.cpp
@@ -31,7 +31,9 @@ OwnPtr<Request> GeminiProtocol::start_request(ConnectionFromClient& client, Byte
     auto protocol_request = GeminiRequest::create_with_job({}, client, *job, move(output_stream));
     protocol_request->set_request_fd(pipe_result.value().read_fd);
 
-    ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+    Core::EventLoop::current().deferred_invoke([=] {
+        ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+    });
 
     return protocol_request;
 }
diff --git a/Userland/Services/RequestServer/HttpCommon.h b/Userland/Services/RequestServer/HttpCommon.h
index a3885ff85b..3e6070c348 100644
--- a/Userland/Services/RequestServer/HttpCommon.h
+++ b/Userland/Services/RequestServer/HttpCommon.h
@@ -102,10 +102,12 @@ OwnPtr<Request> start_request(TBadgedProtocol&& protocol, ConnectionFromClient&
     auto protocol_request = TRequest::create_with_job(forward<TBadgedProtocol>(protocol), client, (TJob&)*job, move(output_stream));
     protocol_request->set_request_fd(pipe_result.value().read_fd);
 
-    if constexpr (IsSame<typename TBadgedProtocol::Type, HttpsProtocol>)
-        ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
-    else
-        ConnectionCache::get_or_create_connection(ConnectionCache::g_tcp_connection_cache, url, job, proxy_data);
+    Core::EventLoop::current().deferred_invoke([=] {
+        if constexpr (IsSame<typename TBadgedProtocol::Type, HttpsProtocol>)
+            ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+        else
+            ConnectionCache::get_or_create_connection(ConnectionCache::g_tcp_connection_cache, url, job, proxy_data);
+    });
 
     return protocol_request;
 }

The connection doesn't need to start right away, though this change seems to break loading that very same twinings.co.uk (no idea why; looking into that before PR'ing).
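
For context: roughly speaking, deferred_invoke() queues the callback and runs it on a later turn of the event loop, after the current IPC handler has returned, which is why start_request() can hand the request object back to the client before any connection work happens. A toy sketch of that mechanism (this is not LibCore's actual implementation):

#include <cstdio>
#include <deque>
#include <functional>

// Toy event loop: callbacks passed to deferred_invoke() run on a later
// turn of the loop, after the current handler has returned.
class EventLoop {
public:
    void deferred_invoke(std::function<void()> callback)
    {
        m_queue.push_back(std::move(callback));
    }

    void pump()
    {
        auto batch = std::move(m_queue);
        m_queue.clear();
        for (auto& callback : batch)
            callback();
    }

private:
    std::deque<std::function<void()>> m_queue;
};

int main()
{
    EventLoop loop;

    // Analogous to start_request(): return to the client first,
    // then open the connection on a later loop iteration.
    loop.deferred_invoke([] { std::printf("establishing connection...\n"); });
    std::printf("start_request() returned; the client is unblocked\n");

    loop.pump(); // connection setup happens here
}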

alimpfard · Feb 24 '24

Master: Average time: 85.24074ms, hits: 88/108
Async connection: Average time: 190.96297ms, hits: 71/108

No idea why the average delay nearly doubles, but despite that, page load time goes down by ~18% (47s -> 38s).

alimpfard · Feb 24 '24

Down to 37s with the RequestServer IPC endpoints fully async; the time spent connecting has also gone way down, and almost all connections are now cached:

Average time: 0.018518519ms, hits: 107/108
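
The hit counts refer to RequestServer's ConnectionCache, which hands back an already-established connection for a host instead of setting up a new one. A hand-wavy sketch of that get-or-create idea (names and structure are illustrative, not the real ConnectionCache API):

#include <map>
#include <memory>
#include <string>
#include <utility>

struct Connection { /* an open TCP or TLS socket in the real thing */ };

using CacheKey = std::pair<std::string, int>; // host, port

class ConnectionCache {
public:
    // Returns the cached connection for host:port if one exists,
    // otherwise establishes a new one and caches it for reuse.
    std::shared_ptr<Connection> get_or_create(std::string const& host, int port)
    {
        auto& slot = m_cache[{ host, port }];
        if (!slot)
            slot = std::make_shared<Connection>(); // the expensive part: connect + handshake
        return slot;
    }

private:
    std::map<CacheKey, std::shared_ptr<Connection>> m_cache;
};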

alimpfard · Feb 26 '24