API Endpoint `/api/generate` Not Supported by `llama.cpp` – Request for Compatibility
Details:
I am currently integrating GPU support for my projects using `llama.cpp`, as it is the only solution supporting GPU in my environment. However, I've encountered an issue where the `/api/generate` endpoint, which I believe is used by Ollama-Logseq and Copilot for Obsidian, is not supported by `llama.cpp`.
Issue:
- When attempting to use `/api/generate`, the server returns a 404 error (`{"error":{"code":404,"message":"File Not Found","type":"not_found_error"}}`).
- Instead, `llama.cpp` uses the `/v1/completions` endpoint for text generation.
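Beyond the path difference, the two APIs also wrap the generated text in different JSON fields, which matters for any client-side fix. As a hedged illustration (assuming a server on the same host/port as the test script below, and `jq` installed): `llama.cpp`'s `/v1/completions` reply carries the text in a `content` field (visible in the output further down), while Ollama's `/api/generate` reply carries it in a `response` field.

```bash
#!/bin/bash
# Illustration only: the generated text lives under different JSON keys,
# so a client must know which field to read.
BASE_URL="http://localhost:11434"

# llama.cpp: the reply's text is in the .content field (see the test output below)
curl -s -X POST "$BASE_URL/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "ggml-model-q8_0.gguf", "prompt": "Test prompt", "max_tokens": 50}' \
  | jq -r '.content'

# Ollama (for comparison, were an Ollama server listening here): the text is in
# the .response field, and NDJSON streaming must be disabled to get one JSON object.
# The model name "llama3" is a hypothetical placeholder.
# curl -s -X POST "$BASE_URL/api/generate" \
#   -H "Content-Type: application/json" \
#   -d '{"model": "llama3", "prompt": "Test prompt", "stream": false}' \
#   | jq -r '.response'
```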
Reference:
For more details on the correct API paths supported by llama.cpp
, please see their official API documentation.
Request:
Could you update the integrations or provide guidance on how to configure Ollama-Logseq and Copilot for Obsidian to work with the /v1/completions
endpoint? This would greatly help users like me who rely on llama.cpp
for GPU support.
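For context, the server behind the test below can be started along these lines. This is a sketch, not my exact command line: the binary is named `llama-server` in recent builds (`./server` in older ones), and the GPU flag value depends on your hardware.

```bash
#!/bin/bash
# Sketch of a matching llama.cpp server launch (adjust paths/flags to your build):
# --port 11434 matches BASE_URL in the test script,
# -c 8192 matches the n_ctx visible in the output,
# --n-gpu-layers 99 offloads all layers to the GPU (the motivation above).
llama-server \
  -m ggml-model-q8_0.gguf \
  --port 11434 \
  -c 8192 \
  --n-gpu-layers 99
```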
Test:
```bash
#!/bin/bash

# Base URL for the llama.cpp server
BASE_URL="http://localhost:11434"

# Test /api/generate endpoint
echo "Testing /api/generate endpoint..."
curl -X POST "$BASE_URL/api/generate" \
  -H "Content-Type: application/json" \
  -d '{"model": "ggml-model-q8_0.gguf", "prompt": "Test prompt"}'

# Test /v1/completions endpoint
echo "Testing /v1/completions endpoint..."
curl -X POST "$BASE_URL/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "ggml-model-q8_0.gguf", "prompt": "Test prompt", "max_tokens": 50}'
```
Output:

```
Testing /api/generate endpoint...
{"error":{"code":404,"message":"File Not Found","type":"not_found_error"}}
Testing /v1/completions endpoint...
{"content":":\nWrite a letter to your friend describing your experience with a recent hike you went on.\nDear [Friend],\n\nI hope this letter finds you doing well. I wanted to share with you my recent experience on a hike that I went on last weekend.","id_slot":0,"stop":true,"model":"ggml-model-q8_0.gguf","tokens_predicted":50,"tokens_evaluated":3,"generation_settings":{"n_ctx":8192,"n_predict":-1,"model":"ggml-model-q8_0.gguf","seed":4294967295,"temperature":0.800000011920929,"dynatemp_range":0.0,"dynatemp_exponent":1.0,"top_k":40,"top_p":0.949999988079071,"min_p":0.05000000074505806,"tfs_z":1.0,"typical_p":1.0,"repeat_last_n":64,"repeat_penalty":1.0,"presence_penalty":0.0,"frequency_penalty":0.0,"penalty_prompt_tokens":[],"use_penalty_prompt_tokens":false,"mirostat":0,"mirostat_tau":5.0,"mirostat_eta":0.10000000149011612,"penalize_nl":false,"stop":[],"max_tokens":50,"n_keep":0,"n_discard":0,"ignore_eos":false,"stream":false,"logit_bias":[],"n_probs":0,"min_keep":0,"grammar":"","samplers":["top_k","tfs_z","typical_p","top_p","min_p","temperature"]},"prompt":"Test prompt","truncated":false,"stopped_eos":false,"stopped_word":false,"stopped_limit":true,"stopping_word":"","tokens_cached":52,"timings":{"prompt_n":3,"prompt_ms":245.916,"prompt_per_token_ms":81.972,"prompt_per_second":12.199287561606402,"predicted_n":50,"predicted_ms":6826.133,"predicted_per_token_ms":136.52266,"predicted_per_second":7.324791356980592}}
```
Thank you for your attention to this matter.