
o1-mini / o1-preview does not work

dfadev opened this issue 1 year ago • 2 comments

Some notes for you re: the o1 models and vim-ai:

  • o1 only supports temperature: 1
  • o1 doesn't support system role
  • o1 doesn't support max_tokens: null
  • o1 doesn't support stream: true

The first three I can work around, but the lack of a non-streaming mode is a blocker.

I think OpenAI will add this support later, so we could just wait for them.
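The constraints above amount to rewriting the request payload before sending it. A minimal sketch (a hypothetical helper for illustration, not part of vim-ai):

```python
def adapt_payload_for_o1(payload):
    """Adjust a chat-completions payload to satisfy the o1 constraints
    listed above. Hypothetical helper, not actual vim-ai code."""
    adapted = dict(payload)
    # o1 only supports temperature: 1
    adapted['temperature'] = 1
    # o1 rejects stream: true
    adapted['stream'] = False
    # o1 rejects max_tokens (even null); it expects max_completion_tokens
    max_tokens = adapted.pop('max_tokens', None)
    if max_tokens:
        adapted['max_completion_tokens'] = max_tokens
    # o1 does not support the system role; downgrade it to a user message
    adapted['messages'] = [
        {**m, 'role': 'user'} if m.get('role') == 'system' else m
        for m in adapted.get('messages', [])
    ]
    return adapted
```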

dfadev avatar Sep 24 '24 03:09 dfadev

Sounds like a matter of conditional renaming in https://github.com/madox2/vim-ai/blob/758be522e6d765eeb78ce7681f4b39e3b05043b8/py/utils.py#L53. One could also consider simply passing through the options declared in Vim, to stay flexible against API changes across the many models.

Konfekt avatar Sep 25 '24 04:09 Konfekt

Yeah, it's the stream: false part that is tripping me up. It's not as simple as just changing the option; logic changes appear to be needed as well, because in non-stream mode you have to wait for and parse a single complete response instead of processing chunks as they arrive.
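The difference in response handling can be sketched as follows (a simplified illustration of the two wire formats, not the actual vim-ai code): a streaming response is a series of `data: {...}` server-sent-event lines carrying deltas, while a non-streaming response is one JSON object with the full message.

```python
import json

def extract_content(body, stream):
    """Extract the assistant text from an OpenAI-style chat response body.
    Sketch only: `body` is the raw response text."""
    if not stream:
        # Non-streaming: a single JSON document with the full message
        return json.loads(body)['choices'][0]['message']['content']
    # Streaming: concatenate the per-chunk deltas from the SSE lines
    parts = []
    for line in body.splitlines():
        line = line.strip()
        if not line.startswith('data:'):
            continue
        data = line[len('data:'):].strip()
        if data == '[DONE]':
            break
        delta = json.loads(data)['choices'][0].get('delta', {})
        parts.append(delta.get('content') or '')
    return ''.join(parts)
```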

dfadev avatar Sep 25 '24 13:09 dfadev

I have just prototyped non-streaming support on the branch support-non-streaming, but I didn't have time to test it properly. Let me know if it works for you.

madox2 avatar Oct 08 '24 21:10 madox2

Nice! I had to do this:

diff --git a/py/utils.py b/py/utils.py
index 381ba75..4869398 100644
--- a/py/utils.py
+++ b/py/utils.py
@@ -51,12 +51,17 @@ def normalize_config(config):
 
 def make_openai_options(options):
     max_tokens = int(options['max_tokens'])
-    return {
+    max_completion_tokens = int(options['max_completion_tokens'])
+    result = {
         'model': options['model'],
-        'max_tokens': max_tokens if max_tokens > 0 else None,
         'temperature': float(options['temperature']),
         'stream': int(options['stream']) == 1,
     }
+    if max_tokens > 0:
+        result['max_tokens'] = max_tokens
+    if max_completion_tokens > 0:
+        result['max_completion_tokens'] = max_completion_tokens
+    return result
 
 def make_http_options(options):
     return {

because the o1 models no longer accept max_tokens; they require max_completion_tokens instead. The API complains even when max_tokens is null.
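The patched function then only emits whichever limit is set to a positive value, so o1 requests never contain a `max_tokens` key. A standalone replica for illustration (the real code lives in `py/utils.py`):

```python
def make_openai_options(options):
    """Replica of the patched vim-ai helper shown in the diff above:
    include max_tokens / max_completion_tokens only when positive."""
    max_tokens = int(options['max_tokens'])
    max_completion_tokens = int(options['max_completion_tokens'])
    result = {
        'model': options['model'],
        'temperature': float(options['temperature']),
        'stream': int(options['stream']) == 1,
    }
    if max_tokens > 0:
        result['max_tokens'] = max_tokens
    if max_completion_tokens > 0:
        result['max_completion_tokens'] = max_completion_tokens
    return result
```

With max_tokens set to 0 and max_completion_tokens set, only the latter reaches the API.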

I also needed to set the initial prompt role to >>> user instead of >>> system, and set temperature to 1, in init.vim:

let initial_prompt =<< trim END
>>> user

You are a completion engine with following parameters:
Task: Provide compact code/text completion, generation, transformation or explanation
Topic: general programming and text editing
Style: Plain result without any commentary, unless commentary is necessary.  Don't use semicolons for javascript.
Audience: Users of text editor and programmers that need to transform/generate text
END

let chat_engine_config = {
\  "engine": "chat",
\  "options": {
\    "stream": 0,
\    "model": "o1-mini",
\    "max_tokens": 0,
\    "max_completion_tokens": 25000,
\    "temperature": 1,
\    "request_timeout": 120,
\    "selection_boundary": "",
\    "initial_prompt": initial_prompt,
\  },
\  "ui": {
\    "open_chat_command": "preset_below",
\    "scratch_buffer_keep_open": 0,
\    "populate_options": 0,
\    "code_syntax_enabled": 1,
\    "paste_mode": 1,
\    "show_initial_prompt": 1,
\  },
\}

sample output:

>>> user

how many R's in strawberry?

<<< assistant

There are three **R**'s in "strawberry".

sample o1-mini response:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "There are three **R**'s in \"strawberry\".",
                "refusal": null,
                "role": "assistant"
            }
        }
    ],
    "created": 1728788354,
    "id": "chatcmpl-AHj74vV1jBJoNRMWk6dSOaBHl0jlT",
    "model": "o1-mini-2024-09-12",
    "object": "chat.completion",
    "system_fingerprint": "fp_692002f015",
    "usage": {
        "completion_tokens": 601,
        "completion_tokens_details": {
            "reasoning_tokens": 576
        },
        "prompt_tokens": 98,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "total_tokens": 699
    }
}
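Note that most of the billed completion tokens in the response above are hidden reasoning tokens that never appear in the message. A small sketch to split them out (assumes the `usage` shape shown above):

```python
def visible_completion_tokens(usage):
    """Return (visible, reasoning) token counts from an o1 usage block.
    Reasoning tokens are billed but never appear in the message content."""
    total = usage['completion_tokens']
    reasoning = usage['completion_tokens_details']['reasoning_tokens']
    return total - reasoning, reasoning
```

For the o1-mini response above this gives 25 visible tokens out of 601 billed completion tokens.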

o1-preview also works:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "3",
                "refusal": null,
                "role": "assistant"
            }
        }
    ],
    "created": 1728788678,
    "id": "chatcmpl-AHjCIeW5SzrPYWYV2nuVxzVVLuIgU",
    "model": "o1-preview-2024-09-12",
    "object": "chat.completion",
    "system_fingerprint": "fp_49f580698f",
    "usage": {
        "completion_tokens": 1355,
        "completion_tokens_details": {
            "reasoning_tokens": 1344
        },
        "prompt_tokens": 102,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "total_tokens": 1457
    }
}

dfadev avatar Oct 13 '24 03:10 dfadev

Thanks @dfadev, I applied your patch and it works well. It is now possible to use the o1 models. Here is an example of how to do it with a role:

[o1-mini]
[o1-mini.options-chat]
stream = 0
model = o1-mini
max_tokens = 0
max_completion_tokens = 25000
temperature = 1
initial_prompt =
  >>> user
  You are a general assistant.
  If you attach a code block add syntax type after ``` to enable syntax highlighting.

madox2 avatar Dec 03 '24 21:12 madox2