
RequestServer: Starting a request can take a long time

Open · awesomekling opened this issue on Feb 22, 2024

When loading websites in Ladybird, the WebContent process often ends up waiting quite a long time for RequestServer to respond to the start_request IPC call.

Here's an example when loading https://twinings.co.uk/

712533.543 WebContent(1903606): sync response 1270977860::4 took 193ms
712534.107 WebContent(1903606): sync response 1270977860::4 took 29ms
712534.247 WebContent(1903606): sync response 1270977860::4 took 141ms
712534.575 WebContent(1903606): sync response 1270977860::4 took 157ms
712534.911 WebContent(1903606): sync response 1270977860::4 took 193ms
712535.151 WebContent(1903606): sync response 1270977860::4 took 121ms
712537.371 WebContent(1903606): sync response 1270977860::4 took 5ms
712537.923 WebContent(1903606): sync response 1270977860::4 took 269ms
712538.047 WebContent(1903606): sync response 1270977860::4 took 125ms
712538.371 WebContent(1903606): sync response 1270977860::4 took 145ms
712539.855 WebContent(1903606): sync response 1270977860::4 took 109ms
712540.103 WebContent(1903606): sync response 1270977860::4 took 77ms
712542.039 WebContent(1903606): sync response 1270977860::4 took 269ms
712542.359 WebContent(1903606): sync response 1270977860::4 took 321ms
712542.871 WebContent(1903606): sync response 1270977860::4 took 429ms
712544.419 WebContent(1903606): sync response 1270977860::4 took 289ms
712545.875 WebContent(1903606): sync response 1270977860::4 took 101ms

As you can see, we spent 2973ms(!) just blocking on this IPC call when we could have been running JS, doing layout, etc!

I think a good fix here could be to make the start_request IPC asynchronous. This would allow WebContent to continue executing immediately after sending it.

It would also be good to improve request startup time in RequestServer, but that's a separate problem.
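
For readers less familiar with the pattern, here's a standalone C++ sketch of the difference (this is not Ladybird's actual LibIPC code; names and timings are invented for illustration): a synchronous round trip blocks the caller for the whole request startup, while an asynchronous send lets the caller keep doing useful work and pick up the result later.

#include <chrono>
#include <cstdio>
#include <future>
#include <thread>

using namespace std::chrono_literals;

// Stand-in for RequestServer doing slow request startup (DNS, TCP, TLS, ...).
static int slow_start_request()
{
    std::this_thread::sleep_for(200ms); // pretend startup takes ~200ms
    return 42;                          // pretend this is the request id
}

// Synchronous style: the caller blocks for the whole round trip.
static void sync_style()
{
    int request_id = slow_start_request(); // nothing else can run here
    std::printf("sync: got request id %d\n", request_id);
}

// Asynchronous style: kick off the request and keep working; collect the reply later.
static void async_style()
{
    auto pending = std::async(std::launch::async, slow_start_request);
    std::printf("async: free to run JS, do layout, etc. in the meantime\n");
    std::printf("async: got request id %d\n", pending.get());
}

int main()
{
    sync_style();
    async_style();
}

In Ladybird's case, the work the caller could be doing in the meantime is exactly the JS execution and layout mentioned above.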

For reproduction, here's the patch that produced the log above:

diff --git a/Userland/Libraries/LibIPC/Connection.cpp b/Userland/Libraries/LibIPC/Connection.cpp
index e98f503c4f..0adbf6ecca 100644
--- a/Userland/Libraries/LibIPC/Connection.cpp
+++ b/Userland/Libraries/LibIPC/Connection.cpp
@@ -197,6 +198,10 @@ ErrorOr<void> ConnectionBase::drain_messages_from_peer()
 
 OwnPtr<IPC::Message> ConnectionBase::wait_for_specific_endpoint_message_impl(u32 endpoint_magic, int message_id)
 {
+    auto timer = Core::ElapsedTimer::start_new();
+    ScopeGuard guard = [&] {
+        dbgln("sync response {}::{} took {}ms", endpoint_magic, message_id, timer.elapsed());
+    };
     for (;;) {
         // Double check we don't already have the event waiting for us.
         // Otherwise we might end up blocked for a while for no reason.
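
The patch just wraps the blocking wait in a Core::ElapsedTimer and logs the elapsed time from a ScopeGuard when the function returns. The same RAII timing trick can be sketched with nothing but the standard library (names here are made up for illustration; this is not the AK/LibCore code):

#include <chrono>
#include <cstdio>
#include <thread>

using namespace std::chrono_literals;

// Logs how long the enclosing scope took once it is destroyed, mirroring the
// ElapsedTimer + ScopeGuard pair in the patch above.
struct ScopeTimer {
    explicit ScopeTimer(char const* label)
        : m_label(label)
    {
    }
    ~ScopeTimer()
    {
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - m_start);
        std::printf("%s took %lldms\n", m_label, static_cast<long long>(elapsed.count()));
    }

    char const* m_label;
    std::chrono::steady_clock::time_point m_start = std::chrono::steady_clock::now();
};

static void wait_for_response() // stand-in for the blocking IPC wait loop
{
    ScopeTimer timer("sync response");
    std::this_thread::sleep_for(150ms); // pretend we blocked on the peer
}

int main()
{
    wait_for_response();
}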

awesomekling · Feb 22 '24

No need to mess with the IPC endpoints; this issue is easily fixable:

diff --git a/Userland/Services/RequestServer/GeminiProtocol.cpp b/Userland/Services/RequestServer/GeminiProtocol.cpp
index 60c36b0579..a675aa9f79 100644
--- a/Userland/Services/RequestServer/GeminiProtocol.cpp
+++ b/Userland/Services/RequestServer/GeminiProtocol.cpp
@@ -31,7 +31,9 @@ OwnPtr<Request> GeminiProtocol::start_request(ConnectionFromClient& client, Byte
     auto protocol_request = GeminiRequest::create_with_job({}, client, *job, move(output_stream));
     protocol_request->set_request_fd(pipe_result.value().read_fd);
 
-    ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+    Core::EventLoop::current().deferred_invoke([=] {
+        ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+    });
 
     return protocol_request;
 }
diff --git a/Userland/Services/RequestServer/HttpCommon.h b/Userland/Services/RequestServer/HttpCommon.h
index a3885ff85b..3e6070c348 100644
--- a/Userland/Services/RequestServer/HttpCommon.h
+++ b/Userland/Services/RequestServer/HttpCommon.h
@@ -102,10 +102,12 @@ OwnPtr<Request> start_request(TBadgedProtocol&& protocol, ConnectionFromClient&
     auto protocol_request = TRequest::create_with_job(forward<TBadgedProtocol>(protocol), client, (TJob&)*job, move(output_stream));
     protocol_request->set_request_fd(pipe_result.value().read_fd);
 
-    if constexpr (IsSame<typename TBadgedProtocol::Type, HttpsProtocol>)
-        ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
-    else
-        ConnectionCache::get_or_create_connection(ConnectionCache::g_tcp_connection_cache, url, job, proxy_data);
+    Core::EventLoop::current().deferred_invoke([=] {
+        if constexpr (IsSame<typename TBadgedProtocol::Type, HttpsProtocol>)
+            ConnectionCache::get_or_create_connection(ConnectionCache::g_tls_connection_cache, url, job, proxy_data);
+        else
+            ConnectionCache::get_or_create_connection(ConnectionCache::g_tcp_connection_cache, url, job, proxy_data);
+    });
 
     return protocol_request;
 }

The connection doesn't need to start right away, though this change seems to break loading that very same twinings.co.uk (no idea why; looking into that before PR'ing).
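
For context: roughly speaking, deferred_invoke() queues the callback and runs it on a later turn of the event loop, after the current IPC handler has returned, which is why start_request() can hand the request object back to the client before any connection work happens. A toy sketch of that mechanism (this is not LibCore's actual implementation):

#include <cstdio>
#include <deque>
#include <functional>

// Toy event loop: callbacks passed to deferred_invoke() run on a later
// turn of the loop, after the current handler has returned.
class EventLoop {
public:
    void deferred_invoke(std::function<void()> callback)
    {
        m_queue.push_back(std::move(callback));
    }

    void pump()
    {
        auto batch = std::move(m_queue);
        m_queue.clear();
        for (auto& callback : batch)
            callback();
    }

private:
    std::deque<std::function<void()>> m_queue;
};

int main()
{
    EventLoop loop;

    // Analogous to start_request(): return to the client first,
    // then open the connection on a later loop iteration.
    loop.deferred_invoke([] { std::printf("establishing connection...\n"); });
    std::printf("start_request() returned; the client is unblocked\n");

    loop.pump(); // connection setup happens here
}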

alimpfard · Feb 24 '24

Master: Average time: 85.24074ms, hits: 88/108
Async connection: Average time: 190.96297ms, hits: 71/108

No idea why the average delay nearly doubles, but despite that, page load time goes down by ~18% (47s -> 38s).

alimpfard · Feb 24 '24

Down to 37s with the RequestServer IPC endpoints fully async; the time spent connecting has also gone way down, and almost all connections are now cached:

Average time: 0.018518519ms, hits: 107/108
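
The hit counts refer to RequestServer's ConnectionCache, which hands back an already-established connection for a host instead of setting up a new one. A hand-wavy sketch of that get-or-create idea (names and structure are illustrative, not the real ConnectionCache API):

#include <map>
#include <memory>
#include <string>
#include <utility>

struct Connection { /* an open TCP or TLS socket in the real thing */ };

using CacheKey = std::pair<std::string, int>; // host, port

class ConnectionCache {
public:
    // Returns the cached connection for host:port if one exists,
    // otherwise establishes a new one and caches it for reuse.
    std::shared_ptr<Connection> get_or_create(std::string const& host, int port)
    {
        auto& slot = m_cache[{ host, port }];
        if (!slot)
            slot = std::make_shared<Connection>(); // the expensive part: connect + handshake
        return slot;
    }

private:
    std::map<CacheKey, std::shared_ptr<Connection>> m_cache;
};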

alimpfard · Feb 26 '24