Ollama Embedding API Returns 404 - Wrong Endpoint Path
When attempting to use Ollama embedding models (both qwen3-embedding:latest and nomic-embed-text) for Vault QA indexing, Copilot sends requests to a malformed URL that results in 404 errors.

Expected Behavior: Copilot should successfully call Ollama's embedding API at http://localhost:11434/api/embeddings

Actual Behavior: Copilot is calling a malformed URL with duplicate path segments, http://localhost:11434/api/embeddings/api/embed, which returns 404 (Not Found).
Steps to Reproduce:
1. Install and start Ollama: OLLAMA_ORIGINS="app://obsidian.md*" ollama serve
2. Pull the embedding model: ollama pull nomic-embed-text
3. Verify the model works via curl:

   ```bash
   curl http://localhost:11434/api/embeddings -d '{
     "model": "nomic-embed-text",
     "prompt": "test text"
   }'
   ```

   ✅ This works correctly and returns embeddings.
4. In Obsidian Copilot Settings:
   - Base URL: http://localhost:11434
   - Add Custom Model: nomic-embed-text
   - Provider: ollama
5. Attempt to index the vault in Vault QA mode.
6. Check the Developer Console (Ctrl+Shift+I).
Console Error:

```
POST http://localhost:11434/api/embeddings/api/embed 404 (Not Found)
Getting text from response
Batch processing error: {error: ResponseError: failed to encode response: json: unsupported value: -Inf}
```

Screenshots: [Screenshot showing the error in console with the duplicate path visible]

Root Cause Analysis: It appears Copilot is:
- Taking the base URL: http://localhost:11434
- Incorrectly appending: /api/embeddings/api/embed
This suggests either:
- The base URL configuration is being set/stored internally as http://localhost:11434/api/embeddings, OR
- Copilot is appending both /api/embeddings AND /api/embed to the base URL
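To illustrate the first hypothesis (purely hypothetical code, not taken from the Copilot source): if the stored base URL already ends in /api/embeddings and the embedding client then appends its own /api/embed path, plain string concatenation reproduces the malformed URL exactly:

```ts
// Hypothetical illustration only -- not taken from the Copilot source.
const storedBaseUrl = "http://localhost:11434/api/embeddings"; // base URL saved with the path included
const clientPath = "/api/embed";                               // path the embedding client appends on its own

const requestUrl = `${storedBaseUrl}${clientPath}`;
console.log(requestUrl);
// => http://localhost:11434/api/embeddings/api/embed  (the 404 URL from the console)
```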
Related Issues: This appears related to Issue #1087, which identified that Copilot uses /api/embed instead of the correct Ollama endpoint /api/embeddings. However, in my case, BOTH paths are being concatenated.

Correct Ollama API Endpoints: According to the Ollama API documentation:
- Embeddings: POST /api/embeddings ✅
- NOT: POST /api/embed ❌
Workaround Attempted: I've verified the model works correctly when called directly via curl, confirming this is a Copilot integration issue, not an Ollama issue.
@logancyang Logan, I need your confirmation on this. Might it be related to some package that we use?
@palyam your referenced issue has the answer https://github.com/logancyang/obsidian-copilot/issues/1087#issuecomment-2608233798
Ollama supports both embed and embeddings internally. Copilot is using the ollama client from langchainjs.
http://localhost:11434/api/embeddings/api/embed looks wrong. Can you show us a screenshot of your Ollama model in the "Add Custom Model" form?
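For reference, the langchainjs Ollama embeddings client is typically constructed roughly like this (a minimal sketch based on the @langchain/community OllamaEmbeddings docs, not the actual Copilot wiring); the baseUrl should be just the host, and the client appends the API path itself:

```ts
import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";

// Sketch only -- not the actual Copilot wiring. The point is that baseUrl
// should be the bare host; the client adds the API path on its own.
const embeddings = new OllamaEmbeddings({
  model: "nomic-embed-text",
  baseUrl: "http://localhost:11434", // no /api/embeddings suffix here
});

const vectors = await embeddings.embedDocuments(["test text"]);
console.log(vectors[0].length);
```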
I believe I’m experiencing this same issue, or perhaps a variation of it.
Setup:
- macOS (MacBook Pro 14” – Apple M3, 16 GB unified memory)
- Ollama running locally with OLLAMA_FLASH_ATTENTION=true
- Main model: qwen2.5:7b-instruct-q4_K_M
- Embedding model: nomic-embed-text:latest
- Base URL in Copilot: http://localhost:11434/v1/
- Using the Copilot plugin in Obsidian (latest community version)
Copilot successfully indexes most of my notes, but the Developer Console repeatedly logs:
```
POST http://localhost:11434/api/embed 500 (Internal Server Error)
```
even though some notes do get embedded successfully.
Example log sequence:
```
plugin:copilot:382 POST http://localhost:11434/api/embed 500 (Internal Server Error)
plugin:copilot:643 Error indexing file Journal/01 Daily/2025/10/2025-10-23.md: ResponseError: do embedding request: Post "http://127.0.0.1:52462/embedding": EOF
```
When I test the embedding endpoint manually, Ollama works perfectly:
```bash
curl http://localhost:11434/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-embed-text:latest","input":"test embedding"}'
```

→ returns a valid embedding vector.
So it seems Copilot is still hitting /api/embed instead of /v1/embeddings, which causes intermittent 500 errors. Lowering the embedding batch size improved stability but didn’t resolve it completely.
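For reference, both routes can be exercised directly; based on my reading of the Ollama docs, they accept the same {model, input} body but return differently shaped JSON (a minimal sketch, not Copilot code):

```ts
// Sketch based on my reading of the Ollama docs; both calls accept { model, input }
// but return differently shaped responses.

// Native endpoint (the one Copilot appears to call):
const native = await fetch("http://localhost:11434/api/embed", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "nomic-embed-text:latest", input: "test embedding" }),
});
console.log((await native.json()).embeddings?.[0]?.length);

// OpenAI-compatible endpoint (the one my curl above hits):
const openai = await fetch("http://localhost:11434/v1/embeddings", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "nomic-embed-text:latest", input: "test embedding" }),
});
console.log((await openai.json()).data?.[0]?.embedding?.length);
```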
Can you confirm if this fix (using the correct /v1/embeddings route for Ollama) has been merged, or if there’s a known workaround in the current release?
Hi, I'm hitting the second issue as well. However, the problem itself seems unrelated to the URL. It's well explained here: https://github.com/ollama/ollama/issues/7288
If I understood this correctly, it occurs when a chunk has more tokens than the embedding model's context window (num_ctx).
This is a proposed fix:
```ts
// LlamaIndex-style configuration (imports assumed to come from the "llamaindex" package)
import { OllamaEmbedding, Settings } from "llamaindex";

Settings.embedModel = new OllamaEmbedding({
  model: "granite-embedding:278m",
  options: {
    num_ctx: 512, // <-- This fixed my issue: caps the context window for embed requests.
  },
});
```
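Since Copilot reportedly uses the langchainjs Ollama client rather than LlamaIndex, the rough equivalent there would be the requestOptions field. This is only a sketch assuming @langchain/community's OllamaEmbeddings and that numCtx maps to Ollama's num_ctx option, not a verified Copilot change:

```ts
import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";

// Sketch only: cap the context window so oversized chunks are handled
// within num_ctx instead of triggering the -Inf / 500 failure from ollama#7288.
const embeddings = new OllamaEmbeddings({
  model: "granite-embedding:278m",
  baseUrl: "http://localhost:11434",
  requestOptions: {
    numCtx: 512, // assumed to be forwarded to Ollama as options.num_ctx
  },
});
```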