[BUG] Wrong ollama embedding endpoint
Description
Hi, I think you are calling the wrong endpoint for local embeddings with Ollama when I use the settings from your instructions here.
According to the official Ollama API documentation here, the endpoint should be http://localhost:11434/api/embed, but kotaemon calls http://localhost:11434/api/embeddings.
The following works:
curl http://localhost:11434/api/embed -d '{
"model": "
The following does not:
curl http://localhost:11434/api/embeddings -d '{
"model": "
There is also the problem that the UI shows no notification about these issues, so one has to dig into the logs. It would be great if the error were surfaced more explicitly.
Reproduction steps
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
Screenshots

So, the examples we used follow the Ollama OpenAI API specs: https://github.com/ollama/ollama/blob/main/docs/openai.md#curl
Please use the Test connection feature to make sure the Ollama connection is working properly for both the LLM and embedding models.
It worked before, but Ollama changed the endpoint from /embeddings to /embed, so the OpenAI client should not work anymore because it uses /embeddings. At least that is my understanding.
The OpenAI client endpoint:
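A rough curl equivalent of that OpenAI-compatible embeddings call (a sketch, assuming the default Ollama port and the nomic-embed-text model; the exact request the client sends may differ):
# Ollama's OpenAI-compatible route; it takes "input", like the OpenAI API
curl http://localhost:11434/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "input": "Why is the sky blue?"
  }'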
Same issue here with ollama version 0.4.1 (ollama -v) and the kotaemon full Docker image.
+1
Is this just an issue with Ollama v0.4?
ollama -v
ollama version is 0.3.9
Call from within Kotaemon app docker runtime:
root@justin-two-towers:/app# curl localhost:11434/api/embeddings -d '{ "model": "llama3.1:8b", "input": "Why is the sky blue?" }'
{"embedding":[]}
Seems fine ...
@r0kk is correct. I did a test just now on a qwen2.5:7b model that I wanted to use for embedding. It is running on-prem in a separate Ollama Docker container: https://ollama.com/library/qwen2.5:7b
Steps I took to reproduce:
# to prep to execute a command in docker
docker exec -it ollama /bin/bash
# install curl inside docker
apt-get update
apt-get install -y curl
# the /embeddings endpoint returned an empty array
curl -X POST http://localhost:11434/api/embeddings -H "Content-Type: application/json" -d '{
"model": "qwen2.5:7b",
"input": ["Why is the sky blue?", "Why is the grass green?"]
}'
> {"embedding":[]}
# but the /embed endpoint succeeded
curl -X POST http://localhost:11434/api/embed \
-H "Content-Type: application/json" \
-d '{
"model": "qwen2.5:7b",
"input": ["Why is the sky blue?", "Why is the grass green?"]
}'
# output was {"model":"qwen2.5:7b","embeddings":
> [[-0.0009619982,0.011638082,0.0019495427,0.0025457952,-0.0059983553,0.0056869467,0.0049384357,0.0036752485,-0.0041196514,0.031427655,-0.010701078,0.0053785336,0.0060591 ......
The reason the default nomic-embed-text works with the /embeddings endpoint is just that it happens to support the old /embeddings endpoint, as evidenced here: https://ollama.com/library/nomic-embed-text
@vap0rtranz I think a difference is that the /embeddings endpoint expects a prompt, whereas /embed expects an input.
I think we both accidentally passed input when we should have passed prompt.
https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embedding
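For example, re-running the earlier call against /api/embeddings with "prompt" instead of "input" should return a non-empty vector (a sketch, assuming the same qwen2.5:7b model as above):
# legacy native endpoint: expects "prompt", not "input"
curl -X POST http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:7b",
    "prompt": "Why is the sky blue?"
  }'
# should return {"embedding":[...]} instead of the empty array seen above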
When I tried to swap out the embedding model from nomic-embed-text to qwen2.5:7b running on ollama on-prem, I ran into this error for POST /v1/embeddings: "llm embedding error: Failed to create new sequence: no input provided"
This actually reminds me of the problem we are noticing here, where the /embeddings endpoint expects a prompt, not an input.
Actually, I think we are all looking at the wrong endpoint. This other thread talks about the difference between /api/embeddings vs /v1/embeddings: https://github.com/ollama/ollama/issues/7242.
I think that Kotaemon is using the /v1 route, as evidenced in the screenshot at the top of this post.
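If that's the case, the quickest way to tell the two routes apart is to send the same "input" payload to both from inside the container (a sketch; exact responses vary by Ollama version):
# OpenAI-compatible route: accepts "input" and returns an OpenAI-style list of embeddings
curl http://localhost:11434/v1/embeddings \
  -d '{"model": "nomic-embed-text", "input": "Why is the sky blue?"}'
# native legacy route: expects "prompt", so the same payload comes back as {"embedding":[]}
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "input": "Why is the sky blue?"}'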