Support Multi<ChatCompletionResponse> AI services

Open dastrobu opened this issue 1 year ago • 12 comments

When declaring an AI service with the following signature:

public interface AiService {
    @SystemMessage("You are a professional poet")
    @UserMessage("""
            Write a poem about {topic}. The poem should be {lines} lines long.
        """)
    Multi<ChatCompletionResponse> writeAPoem(String topic, int lines);
}

I get

dev.langchain4j.exception.IllegalConfigurationException: Only Multi<String> is supported as a Multi return type. Offending method is 'fooAiService#writeAPoem'
	at dev.langchain4j.exception.IllegalConfigurationException.illegalConfiguration(IllegalConfigurationException.java:12)
	at io.quarkiverse.langchain4j.deployment.AiServicesProcessor.handleDeclarativeServices(AiServicesProcessor.java:442)

To be able to access the metadata of the response, such as usage or finishReason, it would be necessary to access the underlying response objects.

dastrobu avatar Aug 27 '24 08:08 dastrobu

We can probably add support for that.

geoand avatar Aug 27 '24 08:08 geoand

To be able to access the metadata of the response, such as usage or finishReason, it would be necessary to access the underlying response objects.

How do you plan to use these in a streaming fashion (as Multi implies)?

geoand avatar Aug 27 '24 10:08 geoand

@geoand you would check the finish_reason of the last event before the [DONE] event.

Here is an example of a raw API call (with max_tokens set to 1):

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":"It"},"finish_reason":null,"index":0,"logprobs":null}],"created":1724759298,"id":"chatcmpl-A0oyIz0Qfd9Hq22NwN3VyT7FtgCzb","model":"gpt-4o-2024-05-13","object":"chat.completion.chunk","system_fingerprint":"fp_abc28019ad"}

data: {"choices":[{"content_filter_results":{},"delta":{},"finish_reason":"length","index":0,"logprobs":null}],"created":1724759298,"id":"chatcmpl-A0oyIz0Qfd9Hq22NwN3VyT7FtgCzb","model":"gpt-4o-2024-05-13","object":"chat.completion.chunk","system_fingerprint":"fp_abc28019ad"}

data: [DONE]

As you can see, the last event shows "finish_reason":"length", while previous events have "finish_reason":null.
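
To illustrate the check described above, here is a minimal sketch that scans raw SSE lines and returns the last non-null finish_reason seen before [DONE]. The class FinishReasonScanner and its method are hypothetical and not part of quarkus-langchain4j. The regular expression is a shortcut that assumes a single choice per chunk; real code would more likely use a JSON parser.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a quarkus-langchain4j API): finds the final finish_reason in raw SSE lines. */
class FinishReasonScanner {

    // Matches "finish_reason":"<value>"; chunks with "finish_reason":null do not match.
    private static final Pattern FINISH_REASON =
            Pattern.compile("\"finish_reason\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns the last non-null finish_reason seen before the [DONE] sentinel, if any. */
    public static Optional<String> lastFinishReason(List<String> sseLines) {
        Optional<String> result = Optional.empty();
        for (String line : sseLines) {
            if (!line.startsWith("data:")) {
                continue; // skip blank separator lines and non-data fields
            }
            String payload = line.substring("data:".length()).trim();
            if (payload.equals("[DONE]")) {
                break; // end of stream
            }
            Matcher m = FINISH_REASON.matcher(payload);
            if (m.find()) {
                result = Optional.of(m.group(1)); // assumes one choice per chunk
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened versions of the two chunks shown above.
        List<String> lines = List.of(
                "data: {\"choices\":[{\"delta\":{\"content\":\"It\"},\"finish_reason\":null,\"index\":0}]}",
                "",
                "data: {\"choices\":[{\"delta\":{},\"finish_reason\":\"length\",\"index\":0}]}",
                "",
                "data: [DONE]");
        System.out.println(lastFinishReason(lines).orElse("none")); // prints "length"
    }
}
```

With Multi<String>, only the "content" deltas reach the caller, so this value is not observable from the AI service today.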

dastrobu avatar Aug 27 '24 11:08 dastrobu

Gotcha, thanks!

geoand avatar Aug 28 '24 07:08 geoand

@jmartisk do you have any spare cycles to look into this?

My guess is that it shouldn't take more than a couple hours for someone who knows the codebase :)

geoand avatar Aug 29 '24 10:08 geoand

@geoand I took a look at the issue and have a PR in my forked repo. I think we should support this case, otherwise there's no way of associating the metadata with the request. Also, the auditing events don't really work in streaming mode (at least the completion event), though that's easily fixable. Do you mind taking a look at the PR at https://github.com/tomas1885/quarkus-langchain4j/pull/1 and letting me know if I should open it in this repo?

tomas1885 avatar Jul 27 '25 09:07 tomas1885

@tomas1885 thanks! Please open the PR against this repo

geoand avatar Jul 28 '25 05:07 geoand

@geoand It works, but I'm not convinced it's the right path. I'll open a discussion with some questions in the discussions section.

tomas1885 avatar Jul 28 '25 09:07 tomas1885

+1

geoand avatar Jul 28 '25 09:07 geoand

@geoand How about allowing the method to return a Multi with events for each of the actual TokenStream events? I started a discussion at https://github.com/quarkiverse/quarkus-langchain4j/discussions/1652 and have a branch ready, but I think there might be room for improvement.

tomas1885 avatar Jul 29 '25 11:07 tomas1885

@tomas1885 it would be a lot easier to just open a draft PR so we can discuss in context, as I am not sure I understand what you are proposing.

Thanks!

geoand avatar Jul 30 '25 06:07 geoand

@geoand I opened a draft PR.

tomas1885 avatar Jul 30 '25 14:07 tomas1885