
Continue.dev is not showing any response from a local OpenAI-compatible API.

Open · getkimb opened this issue 1 year ago · 1 comment


Relevant environment info

- OS: macOS
- Continue: 0.0.62
- IDE: PyCharm (Jetbrains)
- Model: Using GPT4All (OpenAI compatible server)
- config.json:
  
{
  "models": [
    {
      "title": "GPT-4o (Free Trial)",
      "provider": "free-trial",
      "model": "gpt-4o",
      "systemMessage": "You are an expert software developer. You give helpful and concise responses."
    },
    {
      "title": "Llama3 70b (Free Trial)",
      "provider": "free-trial",
      "model": "llama3-70b",
      "systemMessage": "You are an expert software developer. You give helpful and concise responses. Whenever you write a code block you include the language after the opening ticks."
    },
    {
      "title": "Codestral (Free Trial)",
      "provider": "free-trial",
      "model": "codestral"
    },
    {
      "title": "Claude 3 Sonnet (Free Trial)",
      "provider": "free-trial",
      "model": "claude-3-sonnet-20240229"
    },
    {
      "model": "AUTODETECT", // this one is the local server
      "title": "OpenAI",
      "apiBase": "http://localhost:4891/v1/",
      "provider": "openai",
      "systemMessage": "You are an expert software developer. You give helpful and concise responses."
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Starcoder2 3b",
    "provider": "ollama",
    "model": "starcoder2:3b"
  },
  "contextProviders": [
    {
      "name": "diff",
      "params": {}
    },
    {
      "name": "folder",
      "params": {}
    },
    {
      "name": "codebase",
      "params": {}
    }
  ],
  "slashCommands": [
    {
      "name": "edit",
      "description": "Edit selected code"
    },
    {
      "name": "comment",
      "description": "Write comments for the selected code"
    },
    {
      "name": "share",
      "description": "Export the current chat session to markdown"
    },
    {
      "name": "commit",
      "description": "Generate a git commit message"
    }
  ]
}
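
For reference, here is what the local-server entry looks like with the model pinned explicitly instead of AUTODETECT — a sketch only, with the model name copied from the prompt.log settings further down:

```json
{
  "title": "GPT4All (local)",
  "provider": "openai",
  "apiBase": "http://localhost:4891/v1/",
  "model": "Llama 3.1 8B Instruct 128k",
  "systemMessage": "You are an expert software developer. You give helpful and concise responses."
}
```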

Description

I am using GPT4All to load the codebase context, with the Llama 3.1 8B Instruct 128k model and nomic embeddings inside GPT4All. The GPT4All server runs following the instructions here: https://github.com/nomic-ai/gpt4all/wiki/Local-API-Server#enabling-localdocs.

Continue.dev automatically detects the model when the "http://localhost:4891/v1/" URL is set in the config. In the GPT4All logs I can see it receiving the query and generating a response, but Continue.dev shows a blank response, both inline and in the sidebar chat. (Screenshots: 2024-08-24 at 10:30:40 AM and 10:25:02 AM.)

To reproduce

  1. Install GPT4All and enable the local server under Settings.
  2. You don't necessarily need to add LocalDocs.
  3. Configure Continue.dev to use the OpenAI-compatible server.
  4. Ask anything; no response is shown.

Log output

➜  logs cat core.log| tail -n 50

// core.log is blank


➜  logs cat prompt.log| tail -n 50
Settings:
contextLength: 4096
model: Llama 3.1 8B Instruct 128k
maxTokens: 1024
log: undefined

############################################

<system>
You are an expert software developer. You give helpful and concise responses.

<user>
hello



Completion:

I went on to test it with the openai Python package.

from openai import OpenAI

endpoint = "http://localhost:4891/v1"  # GPT4All local API server
api_key = "YOUR_API_KEY"  # replace with your actual API key

client = OpenAI(
    base_url=endpoint,
    api_key=api_key
)

# Send a minimal chat request to the local server
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Hello",
        }
    ],
    model="Llama 3.1 8B Instruct 128k"
)
print(chat_completion)

I am getting a chat completion response in the console.

git:(main) ✗ python3 test.py
ChatCompletion(id='foobarbaz', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content='How are you doing today? Is there something I can help or talk about with', refusal=None, role='assistant', function_call=None, tool_calls=None), references=[])], created=1724476577, model='Llama 3.1 8B Instruct 128k', object='text_completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=16, prompt_tokens=12, total_tokens=28))

getkimb — Aug 24 '24 05:08