gpt4all
HTTP API server returns empty result all the time
System Info
Windows 10 and Windows 11
GPT4All 2.4.19
Only using the Chat UI.
Information
- [ ] The official example notebooks/scripts
- [ ] My own modified scripts
Related Components
- [ ] backend
- [ ] bindings
- [ ] python-bindings
- [X] chat-ui
- [ ] models
- [ ] circleci
- [ ] docker
- [X] api
Reproduction
- I installed the Chat UI on three different machines and loaded the Wizard 1.1 and GPT4All Falcon models. In the UI, everything behaves as expected.
- I enabled the API web server in the settings.
- When sending a request with curl, the request is accepted, but the result is always empty. A closer look reveals a 404 result code.
This happens with every model and independent of the prompt. I checked the local firewall and also added chat.exe as an allowed app, but it made no difference.
It looks like a systematic problem on my side, but I have no clue what it is.
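To separate a firewall problem from an HTTP-level problem, a plain TCP connect against the server port can help. A 404 is itself an HTTP response, so getting one already suggests that something is listening and answering. The helper below is only a rough sketch: `port_is_open` is a made-up name, and port 4891 is taken from the curl command in this thread.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 4891 is the one used by the requests in this thread.
    print("port 4891 reachable:", port_is_open("localhost", 4891))
```

If this prints `True` while the HTTP request still returns 404, the firewall is probably not the cause and the request path or body is the more likely suspect.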
Expected behavior
See above
@SeriousOldMan
Would you like to share the input for your curl cmd?
@yhyu13
For example this one:
curl -X POST -H "Content-Type: application/json" -H "Authorization: Nothing to see here" -d "{\"model\": \"ggml-v3-13b-hermes-q5_1.bin\", \"prompt\": \"is this working\"}" http://localhost:4891/v1
The connection is established, but the answer is always empty. Result code is 404.
Also tried:
- "prompt": "### Human: Who are you? \n### Assistant:"
- Different names for the model (with or without ".bin") and also different models.
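For comparison, here is a minimal Python sketch of the same request that sidesteps shell quoting by building the JSON body with `json.dumps`. It also returns the status code on HTTP errors, so a 404 is visible right away. The helper names `build_request` and `send` are made up for this example. The second path, `/v1/completions`, is an assumption based on the OpenAI-style API layout and is not confirmed anywhere in this thread; the curl command above posts to `/v1` itself.

```python
import json
import urllib.error
import urllib.request


def build_request(base_url, path, model, prompt):
    """Build a JSON POST request without sending it."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return (status_code, body_text), also for HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses end up here; keep the code and body for inspection.
        return err.code, err.read().decode("utf-8")


if __name__ == "__main__":
    # "/v1" is the path from the curl command above.
    # "/v1/completions" is an ASSUMED OpenAI-style path, not confirmed in this thread.
    for path in ("/v1", "/v1/completions"):
        req = build_request(
            "http://localhost:4891",
            path,
            "ggml-v3-13b-hermes-q5_1.bin",
            "is this working",
        )
        try:
            print(path, send(req))
        except OSError as err:
            # Connection refused, timeout, and similar transport-level failures.
            print(path, "connection failed:", err)
```

Comparing the status codes for the two paths would show whether the 404 comes from the path rather than from the model name or the prompt.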
gpt4all_api seems to run ok:
gpt4all_api | Checking for script in /app/prestart.sh
gpt4all_api | There is no script /app/prestart.sh
gpt4all_api | INFO: Will watch for changes in these directories: ['/app']
gpt4all_api | WARNING: "workers" flag is ignored when reloading is enabled.
gpt4all_api | INFO: Uvicorn running on http://0.0.0.0:4891 (Press CTRL+C to quit)
gpt4all_api | INFO: Started reloader process [1] using WatchFiles
gpt4all_api | INFO: Started server process [7]
gpt4all_api | INFO: Waiting for application startup.
gpt4all_api | [2023-10-15 18:11:29,463 7:MainThread] api_v1.events - INFO -
gpt4all_api | Starting up GPT4All API
gpt4all_api | | events.py:22
gpt4all_api | [2023-10-15 18:11:29,463 7:MainThread] main - INFO - Downloading/fetching model: /models/ggml-mpt-7b-chat.bin | main.py:37
gpt4all_api | GGML_ASSERT: /home/circleci/project/gpt4all-backend/llama.cpp/ggml.c:4411: ctx->mem_buffer != NULL
But it always responds with empty answers, and nothing shows up in the logs, as if it never received the requests.
Yes, exactly. And the result code is a 404.
Tested with new version 2.5. Same behaviour.