Unable to use Ollama Open WebUI's reverse Proxy
Describe the bug
When defining a custom base URL for an Ollama instance behind Open WebUI, I get an OpenAI API error:

`Error calling OpenAPI: error calling openai API: error, invalid character 'I' looking for beginning of value`
To Reproduce
Steps to reproduce the behavior:
- Create your own Ollama instance using Open WebUI
- Add it to Wave using `https://ai.host.tld/ollama/v1`, including the AI token and model
- Send a Wave AI message
- See the error
Expected behavior
The request should succeed, just as the equivalent curl request does (see below).
Desktop (please complete the following information):
- OS: Linux x86
- Version: unknown (I don't know where to find the version)
Additional context
The same query works correctly when sent via curl:
```shell
curl https://ai.host.tld/ollama/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer --REDACTED--" \
  -d '{
    "model": "mistral:latest",
    "messages": [
      {
        "role": "system",
        "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
      },
      {
        "role": "user",
        "content": "Compose a poem that explains the concept of recursion in programming."
      }
    ]
  }'
```
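The `invalid character 'I' looking for beginning of value` message is a Go JSON-decode error: the client received a non-JSON body (for example, a plain-text or HTML error page whose first character is `I`, as in "Internal Server Error") where it expected JSON. A minimal Python sketch of the same failure mode, using a hypothetical response body:

```python
import json

# Hypothetical non-JSON body, standing in for whatever the proxy returned.
body = "Internal Server Error"

try:
    json.loads(body)
except json.JSONDecodeError as e:
    # e.pos is the offset of the character the parser choked on.
    print(f"decode failed at char {e.pos}: {body[e.pos]!r}")
    # → decode failed at char 0: 'I'
```

This suggests the reverse proxy (or an auth layer in front of it) is answering Wave's request with an error page rather than forwarding it to the OpenAI-compatible endpoint.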
I am running into this issue as well; however, I have an exposed API endpoint for my loaded models using a combination of LM Studio (`<machine IP>:1234`) and Ollama (`<machine IP>:11434`).

I have tried setting the `ai:baseurl` string to my endpoint ports and even included an API key for testing. I am currently running Arch on my workstation and have a 2U server running Ubuntu 23.04 with multiple GPUs to run larger models. I tried serving an API endpoint using a Flask server instance on my Ubuntu server and pointed the baseurl at that, but I am still receiving an error similar to idoodler's:

`error calling openai API: error, status code: 404, status: 404 Not Found, message: invalid character 'p' after top-level value, body: 404 page not found`
I was able to fix the issue. The format used to add Ollama support is as follows:

```json
"[email protected]": {
  "display:name": "ollama - llama3.2",
  "display:order": 1,
  "ai:*": true,
  "ai:baseurl": "http://localhost:11434/v1",
  "ai:model": "llama3.2:3b",
  "ai:maxtokens": 4096
}
```
I added this to the bottom of my `~/.config/waveterm/presets/ai.json` file. It took a little head-scratching after finding the removed code in a previous commit, but then I was able to add multiple models and get Wave to see the models on both my workstation and my server.
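Since a stray or missing comma in `ai.json` silently breaks the presets, it helps to run the file through a JSON parser after editing. A quick sketch using Python's stdlib `json.tool` (the inline document here stands in for the real file; substitute the path to your `ai.json`):

```shell
# Validate a JSON fragment before saving it into ai.json.
# json.tool pretty-prints valid input and exits non-zero with the
# error position on invalid input.
echo '{
  "[email protected]": {
    "display:name": "ollama - llama3.2",
    "ai:baseurl": "http://localhost:11434/v1",
    "ai:model": "llama3.2:3b",
    "ai:maxtokens": 4096
  }
}' | python3 -m json.tool
```

Running it against the real file (`python3 -m json.tool ~/.config/waveterm/presets/ai.json`) catches exactly the trailing-comma and missing-comma mistakes shown above.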
To make it work with LM Studio, just change the baseurl to `http://localhost:1234/v1/chat/completions`, or whatever endpoint and port you use.
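For reference, an analogous preset entry for LM Studio might look like the following sketch. The preset key, display name, and model name here are hypothetical placeholders (only the port and baseurl path come from the comment above), so adjust them to your setup:

```json
"ai@lmstudio": {
  "display:name": "lmstudio - local model",
  "display:order": 2,
  "ai:*": true,
  "ai:baseurl": "http://localhost:1234/v1/chat/completions",
  "ai:model": "mistral:latest",
  "ai:maxtokens": 4096
}
```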