gptel
Contact to llamafile-AI on server fails
I have two AIs set up, one on my laptop and one on my desktop:
(use-package gptel
  :config
  (gptel-make-openai "testai"   ;Any name
    :stream t                   ;Stream responses
    :protocol "http"
    :host "localhost:8080"      ;Llama.cpp server location
    :models '("test")
    :key nil)
  (gptel-make-openai "desktop"  ;Any name
    :stream t                   ;Stream responses
    :protocol "http"
    :host "10.0.0.8:8080"       ;Llama.cpp server location
    :models '("test")
    :key nil)
  ;; (setq-default
  ;;  gptel-model "test"
  ;;  gptel-backend (gptel-make-openai "testai"
  ;;                  :stream t
  ;;                  :protocol "http"
  ;;                  :host "localhost:8080"
  ;;                  :models '("test")))
  (setq-default
   gptel-model "test"
   gptel-backend (gptel-make-openai "desktop"
                   :stream t
                   :protocol "http"
                   :host "10.0.0.8:8080"
                   :models '("test"))))
If I use the local machine with the defaults that are commented out above, it works. If I try to use the desktop AI, it yields the following error:
desktop response error: ((c4bb9327bb265bd639a950ab5ffe93f8 . 0)) Could not parse HTTP response.
The following shell script, which copies a file to the desktop and feeds it to the AI there, does work though:
#!/bin/bash
scp "$1" alex@10.0.0.8:/home/alex/wizard
ssh alex@10.0.0.8 "sh ~/.local/bin/wizardcoder-python-34b-v1.0.Q5_K_M.llamafile /home/alex/wizard/$1"
What's the problem?
I'm assuming you're running the server llamafile on your desktop, and not the other one.
Try looking at the request log:
- Run (setq gptel-log-level 'debug)
- Try to use the desktop llamafile and reproduce the error
- Look at the *gptel-log* buffer. The curl command and the HTTP response should be present. You can paste that here.
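Independent of gptel, you can also probe each server directly from the laptop. A minimal sketch, assuming your llamafile build serves llama.cpp's usual HTTP routes (the /health path is an assumption about your server version):
#!/bin/bash
# Probe both llama.cpp/llamafile servers directly, bypassing gptel.
# -sS keeps curl quiet but still prints transport errors; -m 5 caps
# the wait at five seconds so an unreachable host fails fast.
for host in localhost:8080 10.0.0.8:8080; do
  echo "== $host =="
  curl -sS -m 5 "http://$host/health" || echo "(no response)"
  echo
done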
Here is the log:
{
"gptel": "request headers",
"timestamp": "2024-03-08 00:42:48"
}
{
"Content-Type": "application/json"
}
{
"gptel": "request body",
"timestamp": "2024-03-08 00:42:48"
}
{
"model": "test",
"messages": [
{
"role": "system",
"content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely."
},
{
"role": "user",
"content": "Can you hear me?"
}
],
"stream": false,
"temperature": 1.0
}
{
"gptel": "request Curl command",
"timestamp": "2024-03-08 00:42:48"
}
[
"curl",
"--disable",
"--location",
"--silent",
"--compressed",
"-XPOST",
"-y300",
"-Y1",
"-D-",
"-w(75aecd991c05b7de7a6e566cc05016ad . %{size_header})",
"-d{\"model\":\"test\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a large language model living in Emacs and a helpful assistant. Respond concisely.\"},{\"role\":\"user\",\"content\":\"Can you hear me?\"}],\"stream\":false,\"temperature\":1.0}",
"-HContent-Type: application/json",
"http://localhost:8080/v1/chat/completions"
]
@nameiwillforget this looks incomplete, did you grab everything in the log buffer?
Yes, but there was another gptel buffer, *gptel-curl*:
HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Type: text/event-stream
Keep-Alive: timeout=5, max=5
Server: llama.cpp
Transfer-Encoding: chunked
(d062d386c408445be36c4ba19bd78419 . 160)
[ "curl", "--disable", "--location", "--silent", "--compressed", "-XPOST", "-y300", "-Y1", "-D-", "-w(75aecd991c05b7de7a6e566cc05016ad . %{size_header})", "-d{\"model\":\"test\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a large language model living in Emacs and a helpful assistant. Respond concisely.\"},{\"role\":\"user\",\"content\":\"Can you hear me?\"}],\"stream\":false,\"temperature\":1.0}", "-HContent-Type: application/json", "http://localhost:8080/v1/chat/completions" ]
I meant in the *gptel-log* buffer. It looks like the above log is incomplete. Could you try again?
I just tried, but now it simply worked. I don't know what changed; I tried several different times before. I did change how llamafile files are executed by default, using mimeo. Could that have something to do with it? Though the LLM was running before I changed that, I think, so I'm not sure how it would. Anyway, I'll look and try to find out what changed.
So it seems it's only the connection between the laptop and the desktop that doesn't work; if I do exactly the same thing from the desktop itself, it works. I tried again and I think the resulting log is the same, but here it is nevertheless:
{
"gptel": "request headers",
"timestamp": "2024-03-11 21:48:18"
}
{
"Content-Type": "application/json"
}
{
"gptel": "request body",
"timestamp": "2024-03-11 21:48:18"
}
{
"model": "test",
"messages": [
{
"role": "system",
"content": "You are a large language model living in Emacs and a helpful assistant. Respond concisely."
},
{
"role": "user",
"content": "Can you hear me?"
}
],
"stream": false,
"temperature": 1.0
}
{
"gptel": "request Curl command",
"timestamp": "2024-03-11 21:48:18"
}
[
"curl",
"--disable",
"--location",
"--silent",
"--compressed",
"-XPOST",
"-y300",
"-Y1",
"-D-",
"-w(8866ba70f2e8a5a85ab4dc25c869e5a1 . %{size_header})",
"-d{\"model\":\"test\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a large language model living in Emacs and a helpful assistant. Respond concisely.\"},{\"role\":\"user\",\"content\":\"Can you hear me?\"}],\"stream\":false,\"temperature\":1.0}",
"-HContent-Type: application/json",
"http://10.0.0.8:8080/v1/chat/completions"
]
What happens if you run that curl command manually?
curl --disable --location --silent --compressed -XPOST -y300 -Y1 -D- \
-w'(8866ba70f2e8a5a85ab4dc25c869e5a1 . %{size_header})' \
-d"{\"model\":\"test\",\"messages\":[{\"role\":\"system\",\"content\":\"You are a large language model living in Emacs and a helpful assistant. Respond concisely.\"},{\"role\":\"user\",\"content\":\"Can you hear me?\"}],\"stream\":false,\"temperature\":1.0}" -H"Content-Type: application/json" \
'http://10.0.0.8:8080/v1/chat/completions'
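Note that --silent suppresses transport errors, so a failed connection prints nothing except the -w cookie. If that's all you get, a variant with -v (a sketch, not the exact command gptel builds) will show whether the TCP connection is established at all:
curl -v -m 10 -XPOST \
  -H'Content-Type: application/json' \
  -d'{"model":"test","messages":[{"role":"user","content":"Can you hear me?"}],"stream":false}' \
  'http://10.0.0.8:8080/v1/chat/completions'
# "Connection refused" or a hang before the "> POST ..." line points at
# the network or the server's bind address rather than at gptel. The
# exit code helps too: 7 = could not connect, 28 = timed out.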
I get the following output:
(8866ba70f2e8a5a85ab4dc25c869e5a1 . 0)
I successfully contacted the model from the desktop immediately before that.
This is a networking/connection issue, unrelated to gptel: the (8866ba70f2e8a5a85ab4dc25c869e5a1 . 0) output means curl received zero bytes of HTTP headers, i.e. no response at all. Your scp/ssh script shows the two machines reach each other on port 22, so the problem is specific to port 8080. I suggest checking if you can ping your desktop/laptop from the other device first.
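Two checks worth running (a sketch; the ss invocation assumes a Linux desktop, and --host is llama.cpp's server flag for the listen address):
# On the laptop: is the desktop reachable, and does port 8080 answer?
ping -c 3 10.0.0.8
curl -sS -m 5 -o /dev/null -w '%{http_code}\n' http://10.0.0.8:8080/

# On the desktop: which address is the server bound to? A line showing
# 127.0.0.1:8080 means it accepts local connections only; llama.cpp's
# server can be started with --host 0.0.0.0 to listen on all interfaces.
ss -tlnp | grep 8080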
@nameiwillforget Did you get gptel working as intended?
No. I couldn't isolate the error, so I put it off. Now I'm using ChatGPT, which works fine. I'll close the issue.