llama.cpp

Bug: embedding endpoint and openai compatibility with input as list

Open gelim opened this issue 1 year ago • 2 comments

Hello,

When I send the following request (which works on OpenAI endpoints), llama.cpp returns an error.

Run the server:

$ llama-server -m bge-large-en-v1.5-334M-Q8_0.gguf  --host 127.0.0.1 --port 8080 --api-key xxx --embedding --embd-output-format json

Query:

POST /v1/embeddings HTTP/1.1
Host: 127.0.0.1:8080
Accept-Encoding: gzip, deflate
Connection: close
Accept: application/json
Content-Type: application/json
Authorization: Bearer xxxxx
X-Stainless-Async: false
Content-Length: 85

{"input": [[15339, 1917]], "model": "bge-large-en-v1.5", "encoding_format": "base64"}

Answer:

HTTP/1.1 400 Bad Request
Access-Control-Allow-Origin: 
Connection: close
Content-Length: 117
Content-Type: application/json; charset=utf-8
Server: llama.cpp

{"error":{"code":400,"message":"\"prompt\" must be a string or an array of integers","type":"invalid_request_error"}}

Issue: llama.cpp strictly requires a string or a flat list of integers, and rejects the list of lists that OpenAI-compatible clients send.
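Until the server accepts the nested form, one possible client-side workaround is to unwrap the outer list before sending. The following is only a sketch: the helper name `unwrap_single_token_list` is hypothetical, and it assumes the server accepts a flat list of integers on this endpoint, as the error message above suggests. It covers only the single-input case shown in the request.

```python
import json


def unwrap_single_token_list(payload: dict) -> dict:
    """Return a copy of payload with "input" flattened when it is a list
    holding exactly one non-empty list of integers; otherwise unchanged.

    Hypothetical helper for illustration, not part of any client library.
    """
    value = payload.get("input")
    if (
        isinstance(value, list)
        and len(value) == 1
        and isinstance(value[0], list)
        and value[0]  # an empty token list is left alone
        and all(isinstance(t, int) and not isinstance(t, bool) for t in value[0])
    ):
        return {**payload, "input": value[0]}
    return dict(payload)


# The request body from the report above.
request = {
    "input": [[15339, 1917]],
    "model": "bge-large-en-v1.5",
    "encoding_format": "base64",
}

flattened = unwrap_single_token_list(request)
body = json.dumps(flattened)  # JSON text to send as the POST body
print(body)
```

Requests that carry several token lists are returned unchanged, so they would still be rejected by the server.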

Name and Version

$ llama-server --version
version: 3486 (0832de72)
built with cc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

gelim • Aug 06 '24 15:08

Naive attempt to fix the issue:

diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 7813a295..e9889594 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -969,6 +969,8 @@ struct server_context {
                 (prompt->is_array() &&  prompt->size() == 1 && prompt->at(0).is_string()) ||
                 (prompt->is_array() && !prompt->empty()     && prompt->at(0).is_number_integer())) {
                 slot.prompt = *prompt;
+           } else if (prompt->is_array() && prompt->size() == 1 && prompt->at(0).is_array()) {
+               slot.prompt = prompt->at(0);
             } else {
                 send_error(task, "\"prompt\" must be a string or an array of integers", ERROR_TYPE_INVALID_REQUEST);
                 return false;
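To make the effect of the added branch easier to see, here is a small Python model of the check in the diff. It is a sketch, not the server code: `resolve_prompt` is a hypothetical name, and the plain-string case is assumed from the error message because that condition lies above the lines visible in the hunk.

```python
def resolve_prompt(prompt):
    """Model the branch order of the patched check.

    Returns the value that would be stored in slot.prompt, or raises
    ValueError where the C++ code calls send_error.
    """

    def is_int(x):
        # bool is a subclass of int in Python; exclude it explicitly
        return isinstance(x, int) and not isinstance(x, bool)

    # Assumed: the plain-string case is handled above the visible hunk lines.
    if isinstance(prompt, str):
        return prompt
    if isinstance(prompt, list):
        # Array holding exactly one string.
        if len(prompt) == 1 and isinstance(prompt[0], str):
            return prompt
        # Non-empty array whose first element is an integer
        # (like the diff, only the first element is inspected).
        if prompt and is_int(prompt[0]):
            return prompt
        # Branch added by the patch: unwrap a single nested array.
        if len(prompt) == 1 and isinstance(prompt[0], list):
            return prompt[0]
    raise ValueError('"prompt" must be a string or an array of integers')


print(resolve_prompt([[15339, 1917]]))
```

In this model, a request with more than one inner list, such as `[[1], [2]]`, still reaches the error branch, so the patch covers only single-input requests.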

gelim • Aug 07 '24 11:08

PR welcome

ggerganov • Aug 08 '24 09:08

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] • Sep 23 '24 01:09