
Support GPT4All Server API

Open ThiloteE opened this issue 1 year ago • 12 comments

Description of solution:

I want to use JabRef's AI feature locally. There are multiple applications out there that provide a local server API, and they very often expose an API that resembles the OpenAI API.

GPT4All is such an application. Others are llama.cpp, Ollama, LM Studio, Jan, and KoboldCpp. I am sure there are more, but those are the most well-known ones.

The big advantage of those applications is that they offer more samplers, GPU acceleration, broader hardware support, and support for models that have not been added to JabRef.
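The practical point here is that all of these servers expose roughly the same HTTP surface, so a client only needs a configurable base URL. A minimal sketch (the port numbers are assumptions based on each project's documented defaults, e.g. GPT4All on 4891 and Ollama on 11434; this is not JabRef code):

```python
# Sketch: targeting different OpenAI-compatible servers is just a matter of
# swapping the base URL. Ports below are the projects' documented defaults
# (assumption, verify against your local setup).

DEFAULT_BASES = {
    "openai": "https://api.openai.com/v1",
    "gpt4all": "http://localhost:4891/v1",    # GPT4All local API server default
    "ollama": "http://localhost:11434/v1",    # Ollama's OpenAI-compatible endpoint
}

def chat_completions_url(base_url: str) -> str:
    """Return the chat-completions endpoint for an OpenAI-compatible server."""
    return base_url.rstrip("/") + "/chat/completions"
```

With such a helper, pointing a client at GPT4All instead of OpenAI is a one-line configuration change.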

Problem

It kind of works with GPT4All already, but something is wrong. I believe the embeddings are not sent together with the prompt, and responses look like they are cut off in the middle.

GPT4All: image

JabRef: image

Additional context

  • GPT4All documentation for local API server
  • The phi-3.1-mini-128k-instruct model can be downloaded here: https://huggingface.co/GPT4All-Community/Phi-3.1-mini-128k-instruct-GGUF/tree/main. Just move the model file into GPT4All's model directory and then configure it in the model settings, as shown in my screenshots below.
  • Documentation about how to configure other custom models: https://github.com/nomic-ai/gpt4all/wiki/Configuring-Custom-Models.

JabRef preferences: image

GPT4All preferences: image

GPT4All model settings 1: image

GPT4All model settings 2: image

ThiloteE avatar Oct 01 '24 11:10 ThiloteE

I also often get these errors/warnings on the command line when I try to send messages while connected to the GPT4All server API. Not sure if this is related.

2024-10-01 13:13:53 [pool-2-thread-4] org.jabref.logic.ai.chatting.AiChatLogic.execute()
INFO: Sending message to AI provider (https://api.openai.com/v1) for answering in entry CooperEtAl200708cah: What are the authors of the paper?
2024-10-01 13:13:53 [JavaFX Application Thread] org.jabref.gui.ai.components.aichat.AiChatComponent.lambda$onSendMessage$11()
ERROR: Got an error while sending a message to AI: io.github.stefanbratanov.jvm.openai.OpenAIException: 400 - message: Invalid 'messages[2].role': did not expect 'user' here, type: invalid_request_error, param: null, code: null
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.lambda$validateHttpResponse$6(OpenAIClient.java:129)
        at java.base/java.util.Optional.ifPresentOrElse(Optional.java:196)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.validateHttpResponse(OpenAIClient.java:127)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.sendHttpRequest(OpenAIClient.java:85)
        at [email protected]/io.github.stefanbratanov.jvm.openai.OpenAIClient.sendHttpRequest(OpenAIClient.java:78)
        at [email protected]/io.github.stefanbratanov.jvm.openai.ChatClient.createChatCompletion(ChatClient.java:37)
        at [email protected]/org.jabref.logic.ai.chatting.model.JvmOpenAiChatLanguageModel.generate(JvmOpenAiChatLanguageModel.java:65)
        at [email protected]/org.jabref.logic.ai.chatting.model.JabRefChatLanguageModel.generate(JabRefChatLanguageModel.java:142)
        at [email protected]/dev.langchain4j.chain.ConversationalRetrievalChain.execute(ConversationalRetrievalChain.java:85)
        at [email protected]/dev.langchain4j.chain.ConversationalRetrievalChain.execute(ConversationalRetrievalChain.java:32)
        at [email protected]/org.jabref.logic.ai.chatting.AiChatLogic.execute(AiChatLogic.java:168)
        at [email protected]/org.jabref.gui.ai.components.aichat.AiChatComponent.lambda$onSendMessage$9(AiChatComponent.java:204)
        at [email protected]/org.jabref.logic.util.BackgroundTask$1.call(BackgroundTask.java:73)
        at [email protected]/org.jabref.gui.util.UiTaskExecutor$1.call(UiTaskExecutor.java:191)
        at javafx.graphics@23/javafx.concurrent.Task$TaskCallable.call(Task.java:1401)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
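The 400 error above ("Invalid 'messages[2].role': did not expect 'user' here") suggests the server enforces strict role alternation in the chat history and rejects two consecutive messages with the same role. One possible workaround is to merge adjacent same-role messages before sending. This is a hypothetical sketch, not JabRef's actual fix, and whether GPT4All enforces this is an assumption inferred from the error message:

```python
# Hypothetical workaround: some OpenAI-compatible servers reject histories
# where two messages with the same role appear back to back. Collapse
# adjacent same-role messages into one before sending the request.

def merge_consecutive_roles(messages: list[dict]) -> list[dict]:
    """Collapse adjacent messages that share a role into a single message."""
    merged: list[dict] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append(dict(msg))  # copy so the input list is untouched
    return merged
```

If JabRef sends the retrieved embedding context as a separate `user` message right before the question, a normalization step like this would explain why merging (or dropping) one of them avoids the 400.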

ThiloteE avatar Oct 01 '24 11:10 ThiloteE

I think there is also an issue when switching models. image Even after clearing the chat history, it is not possible to get around this error message unless I switch to a different entry in JabRef. Then I can chat again.

ThiloteE avatar Oct 01 '24 11:10 ThiloteE

@InAnYan

ThiloteE avatar Oct 01 '24 11:10 ThiloteE

Such an interesting issue and such an interesting behaviour...

IMHO it's very unfortunate that there is no fully OpenAI-API-compatible mode in GPT4All. (Standardization makes people's lives much easier.)

However, I will look into this issue in more detail, because GPT4All is a popular app and this is also important.

InAnYan avatar Oct 01 '24 11:10 InAnYan

Image

This is so weird. I tried it last night and encountered the problem mentioned above, that is, the returned content was truncated. But when I tried it again this afternoon, the response came back normally without truncation.

FeiLi-lab avatar Oct 15 '24 04:10 FeiLi-lab

Ok... I reproduced it. I will try to fix it.

FeiLi-lab avatar Oct 15 '24 05:10 FeiLi-lab

Oh, @FeiLi-lab, @ThiloteE, when you have the issue with truncated output, could you try to click on the text area?

Because there is a bug in the UI where the text area is not expanded. Could this be the cause of the truncated output?

InAnYan avatar Oct 15 '24 06:10 InAnYan

I will try to have a look at this on the weekend.

ThiloteE avatar Oct 17 '24 21:10 ThiloteE

This may also have been https://github.com/ggerganov/llama.cpp/pull/9867 in upstream llama.cpp. The fix would need some time to reach downstream GPT4All.

ThiloteE avatar Oct 17 '24 21:10 ThiloteE

I am also working on this issue. I think the problem that responses look cut off in the middle may come from the request.

I ran the following two commands on my computer, one with max_tokens set and one without, and the result shows that the answer without max_tokens set was cut off.

curl -X POST http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor", "messages": [{"role": "user", "content": "could you please introduce more about your self?"}], "max_tokens": 2048, "temperature": 0.7}'

Image

curl -X POST http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor", "messages": [{"role": "user", "content": "could you please introduce more about your self?"}], "temperature": 0.7}'

Image

I set max_tokens in the code, and now the response looks complete. Preferences: Image

JabRef: Image
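The comparison above is consistent with the server applying a small default completion limit whenever `max_tokens` is omitted (that explanation is an assumption, not confirmed GPT4All behaviour). A hedged sketch of the request-body difference between the two curl commands, with the model name copied from them:

```python
# Sketch: the only difference between the truncated and the complete response
# was whether the request body carried an explicit max_tokens cap.

def chat_body(prompt, max_tokens=None):
    """Build a chat-completions body; include max_tokens only when given."""
    body = {
        "model": "Phi-3.1-mini-128k-instruct-Q4_0-precise-output-tensor",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    if max_tokens is not None:
        # Explicit cap; omitting this field is what produced truncated replies.
        body["max_tokens"] = max_tokens
    return body
```

This is also the kind of field that might belong in JabRef's AI preferences rather than being hard-coded.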

@ThiloteE I will refine the code later so that you can review it.

NoahXu718 avatar Oct 18 '24 15:10 NoahXu718

Oh, nice! Good to know. Yes, a pull request would be nice; otherwise nobody can review it. Do you think this is something we would need to add to the preferences?

ThiloteE avatar Oct 19 '24 21:10 ThiloteE

Welcome to the vibrant world of open-source development with JabRef!

Newcomers, we're excited to have you on board. Start by exploring our Contributing guidelines, and don't forget to check out our workspace setup guidelines to get started smoothly.

In case you encounter failing tests during development, please check our developer FAQs!

Having any questions or issues? Feel free to ask here on GitHub. Need help setting up your local workspace? Join the conversation on JabRef's Gitter chat. And don't hesitate to open a (draft) pull request early on to show the direction it is heading towards. This way, you will receive valuable feedback.

⚠ Note that this issue will become unassigned if it isn't closed within 30 days.

🔧 A maintainer can also add the Pinned label to prevent it from being unassigned automatically.

Happy coding! 🚀

github-actions[bot] avatar Oct 19 '24 22:10 github-actions[bot]