Tool executions on streaming models lead to `The current thread cannot be blocked: vert.x-eventloop-thread-0`
When a streaming model is combined with a tool that does something blocking, e.g. a REST call, the following error is raised:
The current thread cannot be blocked: vert.x-eventloop-thread-0
Here is an example:
@Singleton
static class MyTool {
    @Tool(value = "the answer to life, the universe and everything")
    public String answer() {
        // Blocks the calling thread until the delayed item arrives
        return Uni.createFrom().item("42")
                .onItem().delayIt().by(Duration.ofMillis(500))
                .await().indefinitely();
    }
}

@RegisterAiService(tools = MyTool.class)
interface MyService {
    Multi<String> chat(String msg);
}
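To illustrate why the `await().indefinitely()` call in the tool fails, here is a minimal JDK-only sketch of the kind of guard that produces this message. It is an assumption-laden illustration, not the actual Mutiny or Quarkus code: the class `BlockingGuardSketch`, its methods, and the thread-name prefix check are all hypothetical, and the real frameworks decide differently whether a thread may be blocked.

```java
import java.util.function.Supplier;

class BlockingGuardSketch {

    // Hypothetical check based on the thread name; used here only for illustration.
    static boolean isEventLoopThread(Thread thread) {
        return thread.getName().startsWith("vert.x-eventloop-thread");
    }

    // Mimics a blocking await that refuses to run on an event loop thread.
    static String awaitBlocking(Supplier<String> work) {
        Thread current = Thread.currentThread();
        if (isEventLoopThread(current)) {
            throw new IllegalStateException(
                    "The current thread cannot be blocked: " + current.getName());
        }
        return work.get();
    }

    // Runs awaitBlocking on a thread with the given name and reports the outcome.
    static String callOn(String threadName) throws InterruptedException {
        String[] outcome = new String[1];
        Thread thread = new Thread(() -> {
            try {
                outcome[0] = awaitBlocking(() -> "42");
            } catch (IllegalStateException e) {
                outcome[0] = e.getMessage();
            }
        }, threadName);
        thread.start();
        thread.join(); // join makes the write to outcome[0] visible here
        return outcome[0];
    }
}
```

With this sketch, `BlockingGuardSketch.callOn("vert.x-eventloop-thread-0")` reports the error message, while the same call on any other thread name returns the tool result.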
The reason is that
https://github.com/quarkiverse/quarkus-langchain4j/blob/0cd6f69fc7092b67534e7d60ef2ec87df4d8f88d/core/runtime/src/main/java/io/quarkiverse/langchain4j/runtime/aiservice/AiServiceMethodImplementationSupport.java#L572
tries to execute the tool on the event loop.
I think this should be easily fixable by running it on Infrastructure.getDefaultWorkerPool().
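As a rough sketch of the idea, using only JDK executors as stand-ins: the streaming callback arrives on a single event-loop-like thread, and the blocking tool call is handed to a separate worker pool. The class `WorkerOffloadSketch`, the method names, and the thread names are hypothetical. In the extension I would expect the equivalent to be something like Mutiny's `runSubscriptionOn(Infrastructure.getDefaultWorkerPool())`, but I have not verified that against the linked code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class WorkerOffloadSketch {

    // Stand-in for a blocking tool such as a REST call (hypothetical).
    static String blockingTool() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "42 on " + Thread.currentThread().getName();
    }

    // Receives the tool request on the fake event loop, runs the tool on the worker pool.
    static String runToolOffEventLoop() throws Exception {
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-eventloop-0"));
        ExecutorService workers =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "fake-worker"));
        try {
            CompletableFuture<String> result = CompletableFuture
                    // The model's tool request is observed on the event loop thread
                    .supplyAsync(() -> "tool requested", eventLoop)
                    // The blocking part is explicitly moved to the worker pool
                    .thenApplyAsync(request -> blockingTool(), workers);
            return result.get(5, TimeUnit.SECONDS);
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runToolOffEventLoop());
    }
}
```

The point of the sketch is only that the tool body never runs on the event-loop-like thread, so it is free to block.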
I can contribute a fix, but implementing a test will not be possible without first merging #844.
This is not to be confused with the request for supporting Uni as in #837.
There is a request for async tool execution in langchain4j: https://github.com/langchain4j/langchain4j/issues/1183 but it doesn't look like it will be implemented soon.
@langchain4j is receptive to PRs :). And I can help with the review as well
Hmm, I am not sure https://github.com/langchain4j/langchain4j/issues/1183 will help
There is more to this. Yes, it would be awesome if langchain4j provided an async API. I totally agree with that, but at the moment this is not the case. We did some tests locally with ChatMemoryStore, streaming etc., and the extension calls blocking methods on the event loop thread. As far as we could see, this makes the application unusable. Even if only one person uses the chat and we store the chat messages in, for example, DynamoDB, those operations block the application for everyone else. E.g.:
Thread Thread[vert.x-eventloop-thread-7,5,main] has been blocked for 2704 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
at java.base/jdk.internal.misc.Unsafe.park(Native Method)
at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
at java.base/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)
at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2072)
at io.smallrye.context.CompletableFutureWrapper.get(CompletableFutureWrapper.java:152)
at our.project.DynamoChatMemoryStore.getMessages(DynamoChatMemoryStore.java:55)
at dev.langchain4j.memory.chat.MessageWindowChatMemory.messages(MessageWindowChatMemory.java:80)
at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.prepareSystemMessage(AiServiceMethodImplementationSupport.java:679)
at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.doImplement(AiServiceMethodImplementationSupport.java:172)
at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.implement(AiServiceMethodImplementationSupport.java:154)
at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$1.apply(MethodImplementationSupportProducer.java:31)
at io.quarkiverse.langchain4j.runtime.aiservice.MethodImplementationSupportProducer$1$1.apply(MethodImplementationSupportProducer.java:28)
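The trace above shows a `CompletableFuture.get()` inside `getMessages` parking the event loop thread. A minimal JDK-only sketch of why that affects everyone else: a single-threaded executor stands in for one event loop, and a second, trivial request cannot start while the first one waits for its store. The class `EventLoopStallSketch` and the latch-based "store" are hypothetical and only model the situation; they are not taken from the extension.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class EventLoopStallSketch {

    // Returns true if user B's request was stuck while user A blocked the only thread.
    static boolean secondRequestStalls() throws Exception {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CountDownLatch storeEntered = new CountDownLatch(1);
        CountDownLatch storeAnswered = new CountDownLatch(1);
        try {
            // User A: a hypothetical memory store blocks until the database answers.
            Future<String> userA = eventLoop.submit(() -> {
                storeEntered.countDown();
                storeAnswered.await();
                return "messages for A";
            });
            // User B: a trivial request that needs no I/O at all.
            Future<String> userB = eventLoop.submit(() -> "pong");

            // Wait until A is inside the blocking call, then look at B.
            storeEntered.await();
            boolean stalled = !userB.isDone();

            // The "database" answers; both requests can now finish.
            storeAnswered.countDown();
            userA.get(5, TimeUnit.SECONDS);
            userB.get(5, TimeUnit.SECONDS);
            return stalled;
        } finally {
            eventLoop.shutdown();
        }
    }
}
```

In this model user B is stalled for exactly as long as user A's store call takes, which matches what we see with the blocked-thread warnings.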
We also sometimes see "Thread blocked" warnings during application start without any user questions, so some init steps also seem to block the event loop thread. I have no example for this at the moment though.
If we use something like TokenStream and basically pipe it to an SSE endpoint, the MCP tool usage gets stuck when calling listTools. I could not figure out why this is the case. I only see the async call for listTools, and then I got lost in the RestClient execution. This also somehow freezes the application, and no other REST request is ever processed.
I think the ChatJsonRPCService (dev-ui) works best at the moment, but "things" are different there. The exception is the ChatMemory topic, which also freezes the event loop thread during Multi creation.
I did not open new issues or PRs to solve anything, because I do not even know where to start. A lot is already implemented, but because of the non-reactive API of langchain4j it seems it was not considered that some operations are blocking.