error, status code: 404, message: json: cannot unmarshal number into Go value of type openai.ErrorResponse
Describe the bug
I'm getting 'ERROR There was a problem with the ollama API request.' when trying to use mods with Ollama.
This used to work without issues until recently; I don't know whether an update to mods or Ollama broke something.
Setup
Please complete the following information along with version numbers, if applicable.
- macOS 14.6.1
- zsh 5.9 (arm-apple-darwin22.1.0)
- Alacritty 0.13.2 (1)
- tmux 3.4
- ollama version is 0.3.11
- mods version v1.6.0 (84099bd)
Steps to reproduce the behavior:
- Install both with `brew install mods ollama`, then run:

```
mods -f "Hello, world" --api ollama --model llama3:latest
```

- This gives the error below:

```
ERROR There was a problem with the ollama API request.
error, status code: 404, message: json: cannot unmarshal number into Go value of type openai.ErrorResponse
```
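For what it's worth, that unmarshal message is what Go's encoding/json emits when the response body is a bare JSON number rather than the OpenAI-style error object the client expects. A minimal sketch reproducing it (the struct here is a simplified stand-in for go-openai's openai.ErrorResponse, not mods' actual code path):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-in for go-openai's openai.ErrorResponse
// (an assumption about the shape; the real type lives in
// github.com/sashabaranov/go-openai).
type ErrorResponse struct {
	Error struct {
		Message string `json:"message"`
	} `json:"error"`
}

func main() {
	// A 404 body that is a bare JSON number instead of an
	// OpenAI-style {"error": {...}} object fails to decode.
	body := []byte(`404`)
	var er ErrorResponse
	if err := json.Unmarshal(body, &er); err != nil {
		// Prints: json: cannot unmarshal number into Go value
		// of type main.ErrorResponse
		fmt.Println(err)
	}
}
```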
I can confirm my Ollama setup is working:

```
curl http://127.0.0.1:11434/v1/models/llama3:latest
{"id":"llama3:latest","object":"model","created":1726132093,"owned_by":"library"}

curl -X POST http://127.0.0.1:11434/v1/completions -H "Content-Type: application/json" -d '{"model": "llama3:latest", "prompt": "Hello, world!", "max_tokens": 50}'
{"id":"cmpl-855","object":"text_completion","created":1727250150,"model":"llama3:latest","system_fingerprint":"fp_ollama","choices":[{"text":"Hello, world","index":0,"finish_reason":"stop"}],"usage":{"prompt_tokens":14,"completion_tokens":5,"total_tokens":19}}
```
See my mods.yml config file at the end of this post.
Expected behavior
- mods should use the Ollama llama3 model and respond to the prompt.
Additional context
mods.yml:
```yaml
apis:
  openai:
    base-url: https://api.openai.com/v1
    models:
      gpt-4:
        aliases: ["4"]
        max-input-chars: 24500
        fallback: gpt-3.5-turbo
      gpt-4-32k:
        aliases: ["32k"]
        max-input-chars: 98000
        fallback: gpt-4
      gpt-3.5-turbo:
        aliases: ["35t"]
        max-input-chars: 12250
        fallback: gpt-3.5
      gpt-3.5:
        aliases: ["35"]
        max-input-chars: 12250
        fallback:
  localai:
    base-url: http://localhost:8080
    models:
      ggml-gpt4all-j:
        aliases: ["local", "4all"]
        max-input-chars: 12250
        fallback:
  ollama:
    base-url: http://127.0.0.1:11434/v1
    api-key-env: NA
    models:
      "llama3:latest":
        max-input-chars: 4000
default-model: llama3:latest
max-input-chars: 12250
format: false
quiet: false
temp: 1.0
topp: 1.0
no-limit: false
include-prompt-args: false
include-prompt: 0
max-retries: 5
fanciness: 10
status-text: Generating
```
Same here. I rolled back to v1.3.1 and it is working again.
I uninstalled the current v1.6.0 and tested earlier versions until I hit the first one that worked.
To install that version using Go:

```
go install github.com/charmbracelet/[email protected]
```
Try changing ollama.base-url to http://127.0.0.1:11434/api instead of http://127.0.0.1:11434/v1. Worked for me. Ollama API doc for reference: https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion
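Applied to the config above, that change would look like this (only the base-url line differs from the original):

```yaml
  ollama:
    base-url: http://127.0.0.1:11434/api
    api-key-env: NA
    models:
      "llama3:latest":
        max-input-chars: 4000
```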
I think @thedenisnikulin's comment is on point.
Closing.