Support `oobabooga/text-generation-webui` OpenAI API implementation.
The old API from https://github.com/oobabooga/text-generation-webui is deprecated in favor of their OpenAI API implementation, which is supposed to be a drop-in replacement. Simply setting the endpoints `org-ai-openai-chat-endpoint` and `org-ai-openai-completion-endpoint` almost works, except that the `org-ai--insert-chat-completion-response` function seems to assume that `delta.role` and `delta.content` are mutually exclusive. That may be the case for OpenAI's own implementation of the API, but the text-generation-webui implementation can have both fields non-empty in every chunk while also repeating the same role. There does not seem to be any indication in the OpenAI API reference that `delta.role` and `delta.content` need to be mutually exclusive, so I don't think it would be correct to call this a bug on text-generation-webui's end.
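To make the failure mode concrete, here is a minimal sketch (not org-ai's actual code, just the same kind of role-before-content `or` dispatch) of how a chunk carrying both fields gets classified as a repeated role while its text is dropped:

;; Minimal sketch, not org-ai's actual code: a role-first `or' dispatch
;; over a streamed delta.  `when-let' lives in subr-x on older Emacsen.
(require 'subr-x)

(defun my/classify-delta (delta)
  "Classify DELTA plist the way a role-before-content check would."
  (or (when-let ((role (plist-get delta 'role)))
        (cons 'role role))
      (when-let ((content (plist-get delta 'content)))
        (cons 'text content))))

;; OpenAI-style stream: role arrives once, then content-only chunks.
(my/classify-delta '(role "assistant"))              ;; => (role . "assistant")
(my/classify-delta '(content "I"))                   ;; => (text . "I")

;; text-generation-webui-style chunk: both fields are set in every chunk,
;; so the dispatch short-circuits on the role and the text is never emitted.
(my/classify-delta '(role "assistant" content "I"))  ;; => (role . "assistant")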
Consequently, I get results like this:
#+begin_ai :max-tokens 5
This is a test.
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
#+end_ai
With this as the request buffer:
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": ""}, "delta": {"role": "assistant", "content": ""}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "I"}, "delta": {"role": "assistant", "content": "I"}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "'"}, "delta": {"role": "assistant", "content": "'"}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": "m"}, "delta": {"role": "assistant", "content": "m"}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": " sorry"}, "delta": {"role": "assistant", "content": " sorry"}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": null, "message": {"role": "assistant", "content": ","}, "delta": {"role": "assistant", "content": ","}}]}
data: {"id": "chatcmpl-1701438003005163520", "object": "chat.completions.chunk", "created": 1701438003, "model": "deepseek-coder-33b-instruct.Q4_K_M.gguf", "choices": [{"index": 0, "finish_reason": "length", "message": {"role": "assistant", "content": ""}, "delta": {"role": "assistant", "content": ""}}], "usage": {"prompt_tokens": 37, "completion_tokens": 6, "total_tokens": 43}}
After trying with the settings,
(setq org-ai-openai-chat-endpoint "http://localhost:1234/v1/chat/completions")
(setq org-ai-openai-completion-endpoint "http://localhost:1234/v1/completions")
I can confirm the issue when testing with the server provided by LM Studio:
#+begin_ai
[SYS]: You are a helpful assistant.
[ME]: Hi
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[AI]:
[ME]:
#+end_ai
I also used the server provided with oobabooga and I get the same thing. To replicate, set the same API key in text-generation-webui and in Emacs.
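Roughly, the Emacs side only needs the endpoints pointed at the local server plus a matching key; the values below are placeholders and need to match however the local server was started (the token variable org-ai uses is `org-ai-openai-api-token`):

;; Example local setup; all values are placeholders, adjust to your server.
(setq org-ai-openai-api-token "sk-any-local-key"  ;; must match the key configured in text-generation-webui
      org-ai-openai-chat-endpoint "http://localhost:5000/v1/chat/completions"
      org-ai-openai-completion-endpoint "http://localhost:5000/v1/completions")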
As a workaround, I changed the order of evaluation in `org-ai--normalize-response` for the OpenAI part to check `delta.content` first:
;; try openai streamed
(t (let ((choices (plist-get response 'choices)))
     (cl-loop for choice across choices
              ;; Check `delta.content' before `delta.role' so that chunks
              ;; carrying both fields are treated as streamed text instead
              ;; of a repeated role.
              append (or (when-let ((content (plist-get (plist-get choice 'delta) 'content)))
                           (list (make-org-ai--response :type 'text :payload content)))
                         (when-let ((finish-reason (plist-get choice 'finish_reason)))
                           (list (make-org-ai--response :type 'stop :payload finish-reason)))
                         (when-let ((role (plist-get (plist-get choice 'delta) 'role)))
                           (list (make-org-ai--response :type 'role :payload role)))
                         ;; Content check kept from the upstream order;
                         ;; effectively unreachable now that content is
                         ;; checked first.
                         (when-let ((content (plist-get (plist-get choice 'delta) 'content)))
                           (list (make-org-ai--response :type 'text :payload content)))))))
The downside of this is that the `[AI]:` token is no longer inserted before the response.
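A possible refinement (untested sketch, not a drop-in patch) would be to remember the last role seen and only emit a role response when it changes, then always emit the content, so the `[AI]:` marker is still inserted once before the streamed text:

(defvar my/org-ai--last-streamed-role nil
  "Hypothetical state: last role seen in the current streamed response.
Would need to be reset whenever a new request starts.")

;; try openai streamed -- emit the role only on change, content always
(t (let ((choices (plist-get response 'choices)))
     (cl-loop for choice across choices
              for delta = (plist-get choice 'delta)
              append (append
                      (when-let ((role (plist-get delta 'role)))
                        (unless (equal role my/org-ai--last-streamed-role)
                          (setq my/org-ai--last-streamed-role role)
                          (list (make-org-ai--response :type 'role :payload role))))
                      (when-let ((content (plist-get delta 'content)))
                        (list (make-org-ai--response :type 'text :payload content)))
                      (when-let ((finish-reason (plist-get choice 'finish_reason)))
                        (list (make-org-ai--response :type 'stop :payload finish-reason)))))))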