
feat: provide alternative local/offline solution

Robitx opened this issue 1 year ago • 10 comments

  • https://stability.ai/
  • https://ollama.ai/

Ideally via Docker containers providing a callable API.

Robitx avatar Nov 12 '23 14:11 Robitx

At least for the "chat" function, this already works. I've tested the plugin with OpenChat 3.5 running with llama-cpp-python, which has an OpenAI-compatible web server. Here's my config:

    local default_config = require('gp.config')
    -- copy the default agents and append a local one; unpack() only expands
    -- fully when it is the last expression in a table constructor, so putting
    -- it first would silently drop all but the first default agent
    local agents = { unpack(default_config.agents) }
    table.insert(agents, {
      name = "OpenChat3-5",
      chat = true,
      command = false,
      -- temperature is a number, not a string
      model = { model = 'openchat_3.5.Q6_K', temperature = 0.5, top_p = 1 },
      system_prompt = default_config.agents[1].system_prompt,
    })
    require('gp').setup {
      openai_api_key = 'dummy',
      openai_api_endpoint = 'http://127.0.0.1:8000/v1/chat/completions',
      cmd_prefix = 'Gp',
      agents = agents,
      chat_topic_gen_model = 'openchat_3.5.Q6_K',
    }

Note that the model parameter is ignored by llama-cpp-python.

I'm not familiar with all the plugin features, but I will be sure to try the "command" later.

tarruda avatar Dec 02 '23 14:12 tarruda

  • https://github.com/Mozilla-Ocho/llamafile
  • https://github.com/deepseek-ai/DeepSeek-Coder

Robitx avatar Dec 18 '23 07:12 Robitx

EDIT: this was user error, please disregard.

I'm trying this out with LM Studio and it almost works. LM Studio is throwing an error because the content value for the assistant is empty.

    [2024-01-30 20:43:15.630] [INFO] Received POST request to /v1/chat/completions with body: {
      "model": "gpt-3.5-turbo-16k",
      "stream": true,
      "messages": [
        {
          "role": "system",
          "content": "You are a general AI assistant.\n\nThe user provided the additional info about how they would like you to respond:\n\n- If you're unsure don't guess and say you don't know instead.\n- Ask question if you need clarification to provide better answer.\n- Think deeply and carefully from first principles step by step.\n- Zoom out first to see the big picture and then zoom in to details.\n- Use Socratic method to improve your thinking and coding skills.\n- Don't elide any code from your output if the answer requires coding.\n- Take a deep breath; You've got this!"
        },
        {
          "role": "user",
          "content": "Tell me a joke"
        },
        {
          "role": "assistant",
          "content": ""
        },
        {
          "role": "user",
          "content": "Summarize the topic of our conversation above in two or three words. Respond only with those words."
        }
      ]
    }
    [2024-01-30 20:43:15.630] [ERROR] [Server Error] {"title":"'messages' array must only contain objects with a 'content' field that is not empty"}

I'm not sure that I have any influence over this in terms of configuration.

johnallen3d avatar Jan 31 '24 22:01 johnallen3d

@johnallen3d Hey, this seems weird. I've just tried LM Studio (on Linux) against the main branch by just changing openai_api_endpoint = "http://localhost:1234/v1/chat/completions", and it works fine.
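
For reference, a minimal sketch of that one-line change (the placeholder API key is an assumption, carried over from the llama-cpp-python config above; a local server shouldn't validate it):

    -- minimal sketch: point the plugin at LM Studio's local server
    require('gp').setup {
      openai_api_key = 'dummy', -- assumption: placeholder, not validated locally
      openai_api_endpoint = 'http://localhost:1234/v1/chat/completions',
    }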

If the chat # topic header is not set, the plugin makes two calls: the first provides the answer to the user message, and the second generates the topic name. The call you've provided seems to be the one for generating the chat # topic header, while the empty assistant content should be filled in during the first call to the endpoint.

Could you provide more info? The log for the first call if visible, the result of :GpInspectPlugin after the failure, and the OS you're using.

I'm slowly cooking support for multiple providers in #93, currently working with OpenAI-compatible APIs, the Copilot endpoint, and Ollama. I've just added LM Studio and it seems to work in that branch as well.

Robitx avatar Jan 31 '24 23:01 Robitx

Ok, @Robitx, I tried this again and it's working fine. 🤦‍♂️ Sorry for the false report, I'll edit my comment to clarify that this was my mistake.

Thanks for your work on this! Looking forward to #93!

johnallen3d avatar Feb 01 '24 00:02 johnallen3d

Another potential backend for you to consider, @Robitx.

https://github.com/TabbyML/tabby

It looks like they have, or are working towards, an OpenAI-compatible API. The nice thing here is that you could use tabby serve to serve up model(s) both for inline tabby completions (à la Copilot) and for gp.nvim-style chats etc.

johnallen3d avatar Feb 01 '24 16:02 johnallen3d

I have tried it with Ollama and it works perfectly. It seems that all that is really needed to turn this plugin into a multi-provider client is the ability to override the default API key and endpoint on a per-agent basis. It would make it even more awesome than it already is.

Right now it is possible to write a hook that does that, but it wouldn't play nicely with the agent commands. One would effectively have to wrap them so that they change the key and endpoint as well as performing the actual agent switch (see the sketch below).
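
A rough, untested sketch of the wrapping idea (assumptions: the merged options are reachable as gp.config at runtime, mutating them affects the next request, and an "Ollama" agent plus the :GpAgent command exist; none of this is a documented pattern):

    -- sketch only: swap key/endpoint, then switch agents, via a custom hook
    -- callable as :GpOllamaAgent
    require('gp').setup {
      hooks = {
        OllamaAgent = function(gp, _)
          -- assumption: gp.config holds the merged options and can be mutated
          gp.config.openai_api_key = 'dummy'
          gp.config.openai_api_endpoint = 'http://localhost:11434/v1/chat/completions'
          vim.cmd('GpAgent Ollama') -- assumption: :GpAgent accepts an agent name
        end,
      },
    }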

helins avatar Feb 15 '24 15:02 helins

So if I were using Ollama, I'd just set the openai_api_endpoint to something like http://localhost:11434/api/generate, right?
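
Something like this, assuming the plugin's OpenAI-style chat payloads (see the log above) mean Ollama's OpenAI-compatible endpoint (/v1/chat/completions) would be the right target rather than the native /api/generate:

    -- sketch for Ollama; assumes its OpenAI-compatible endpoint, not /api/generate
    require('gp').setup {
      openai_api_key = 'dummy', -- placeholder; a local server shouldn't validate it
      openai_api_endpoint = 'http://localhost:11434/v1/chat/completions',
    }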

bmikaili avatar Feb 27 '24 08:02 bmikaili

@Robitx what would need to be done to support any OpenAI-compatible alternative? I'm guessing exposing an open_ai_url at the agent level (defaulting to OpenAI's API) so you could set it on a per-agent basis (even mix solutions if you want). A hypothetical sketch of that idea is below.
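
Purely illustrative (open_ai_url is not an existing gp.nvim option, just the field name guessed above):

    -- hypothetical: per-agent endpoint override; open_ai_url does not exist yet
    require('gp').setup {
      agents = {
        {
          name = 'LocalMistral',
          chat = true,
          command = true,
          open_ai_url = 'http://localhost:11434/v1/chat/completions', -- hypothetical field
          model = { model = 'mistral', temperature = 0.7 },
        },
        -- agents without open_ai_url would keep defaulting to OpenAI's API
      },
    }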

I'd be happy to help, I'm really enjoying using this plugin and would love to experiment with open-source models!

joshmedeski avatar Mar 27 '24 22:03 joshmedeski

@joshmedeski it's being done at https://github.com/Robitx/gp.nvim/pull/93

teto avatar Mar 30 '24 15:03 teto

Feel free to close, it's merged.

primeapple avatar Jul 11 '24 10:07 primeapple