Support `oobabooga/text-generation-webui` OpenAI API implementation.

Open sg1fan opened this issue 1 year ago • 2 comments

The old API from https://github.com/oobabooga/text-generation-webui is deprecated in favor of their OpenAI API implementation, which is supposed to be a drop-in replacement. Simply setting the endpoints org-ai-openai-chat-endpoint and org-ai-openai-completion-endpoint almost works, except that the org-ai--insert-chat-completion-response function seems to assume that delta.role and delta.content are mutually exclusive. This might be the case for OpenAI's implementation of the OpenAI API, but the text-generation-webui implementation may have both fields be non-empty while also repeating the same role in every chunk. There does not seem to be any indication in the OpenAI API reference that delta.role and delta.content need to be mutually exclusive, so I don't think it would be correct to call this a bug on text-generation-webui's end.

Consequently, I get results like this:

#+begin_ai :max-tokens 5
This is a test.
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
#+end_ai

With this as the request buffer:

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": ""}, "delta": {"role": "assistant", "content": ""}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "I"}, "delta": {"role": "assistant", "content": "I"}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "'"}, "delta": {"role": "assistant", "content": "'"}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "m"}, "delta": {"role": "assistant", "content": "m"}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": " sorry"}, "delta": {"role": "assistant", "content": " sorry"}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": ","}, "delta": {"role": "assistant", "content": ","}}]}

data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": "length", "message": {"role": "assistant", "content": ""}, "delta": {"role": "assistant", "content": ""}}], "usage": {"prompt_tokens": 37, "completion_tokens": 6, "total_tokens": 43}}
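To make the overlap concrete, here is a minimal sketch that parses an abbreviated copy of the second chunk above and reads both fields from the same delta. It assumes the stream is decoded with json.el into plists with symbol keys and vector arrays; `my/parse-chunk` and `my/first-delta` are hypothetical helper names for illustration, not part of org-ai.

```elisp
(require 'json)
(require 'subr-x) ; for `string-remove-prefix' on older Emacs versions

(defun my/parse-chunk (line)
  "Parse one server-sent-event LINE of the form \"data: {...}\" into a plist.
Hypothetical helper; assumes symbol keys and vector arrays."
  (let ((json-object-type 'plist)
        (json-key-type 'symbol)
        (json-array-type 'vector))
    (json-read-from-string (string-remove-prefix "data: " line))))

(defun my/first-delta (chunk)
  "Return the delta plist of the first choice in CHUNK."
  (plist-get (aref (plist-get chunk 'choices) 0) 'delta))

;; Abbreviated version of the second chunk: only the fields that matter here.
(let ((delta (my/first-delta
              (my/parse-chunk
               "data: {\"choices\": [{\"index\": 0, \"finish_reason\": null, \"delta\": {\"role\": \"assistant\", \"content\": \"I\"}}]}"))))
  ;; Role and content arrive together in one delta.
  (list (plist-get delta 'role) (plist-get delta 'content)))
;; => ("assistant" "I")
```

Any consumer that treats a delta with a role as a "new message" marker will therefore start a new [AI]: line for every chunk of this stream.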

sg1fan Dec 01 '23 13:12

After trying with the settings,

(setq org-ai-openai-chat-endpoint "http://localhost:1234/v1/chat/completions")
(setq org-ai-openai-completion-endpoint "http://localhost:1234/v1/completions")

I can confirm it when testing with the server provided by lm-studio:

#+begin_ai
[SYS]: You are a helpful assistant.

[ME]: Hi

[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 
[AI]: 

[ME]: 
#+end_ai

Maverobot Dec 15 '23 05:12

I used the server provided with oobabooga and I get the same thing. To reproduce, set the same API key in text-generation-webui and Emacs.

sigma-957 Feb 14 '24 04:02

As a workaround, I changed the order of evaluation in org-ai--normalize-response for the openai part to first check for delta.content:

;; try openai streamed
(t (let ((choices (plist-get response 'choices)))
     (cl-loop for choice across choices
              ;; `or' stops at the first non-nil branch, so content now wins over role.
              append (or (when-let ((content (plist-get (plist-get choice 'delta) 'content)))
                           (list (make-org-ai--response :type 'text :payload content)))
                         (when-let ((finish-reason (plist-get choice 'finish_reason)))
                           (list (make-org-ai--response :type 'stop :payload finish-reason)))
                         ;; Only reached when the delta carries no content.
                         (when-let ((role (plist-get (plist-get choice 'delta) 'role)))
                           (list (make-org-ai--response :type 'role :payload role)))))))

The downside of this is that the [AI]: token is no longer inserted before the response.
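One way to keep the prefix while still handling deltas that carry both fields might be to emit the role only when it differs from the last role seen, and then emit the content from the same delta. The following is a minimal standalone sketch of that idea, not a patch against org-ai: `my/response`, `my/normalize-choice`, and the `last-role` argument are hypothetical names, and the chunk layout (plists with symbol keys) is assumed from the workaround above.

```elisp
(require 'cl-lib)

;; Hypothetical stand-in for the response struct used in the workaround.
(cl-defstruct my/response type payload)

(defun my/normalize-choice (choice last-role)
  "Return a list of `my/response' objects for one streamed CHOICE.
LAST-ROLE is the role most recently emitted for this stream, or nil.
A role event is produced only when the delta's role differs from LAST-ROLE."
  (let* ((delta (plist-get choice 'delta))
         (role (plist-get delta 'role))
         (content (plist-get delta 'content))
         (finish-reason (plist-get choice 'finish_reason)))
    (append
     ;; Emit the role once, not on every chunk that repeats it.
     (when (and (stringp role) (not (equal role last-role)))
       (list (make-my/response :type 'role :payload role)))
     ;; Emit content from the same delta; skip empty strings.
     (when (and (stringp content) (not (string= content "")))
       (list (make-my/response :type 'text :payload content)))
     ;; A string finish_reason (for example "length") ends the stream.
     (when (stringp finish-reason)
       (list (make-my/response :type 'stop :payload finish-reason))))))

(let ((choice '(index 0 finish_reason nil
                delta (role "assistant" content "I"))))
  (list
   ;; First chunk of a stream: no role seen yet.
   (mapcar #'my/response-type (my/normalize-choice choice nil))
   ;; Later chunk repeating the same role.
   (mapcar #'my/response-type (my/normalize-choice choice "assistant"))))
;; => ((role text) (text))
```

The caller would need to remember the last emitted role per stream and pass it back in, which is the part that would have to be wired into the real normalization function.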

dandersch Nov 21 '24 07:11