gpt4all
[Feature] GPU support in docker-based API server
System Info
GTX1060, Win10, GPT4All Falcon
Information
- [ ] The official example notebooks/scripts
- [ ] My own modified scripts
Reproduction
- Run GPT4All
- Use GPT4All's built-in chat
- Use the GPT4All API
- See that the built-in chat uses the GPU (as selected in settings), while the API only uses the CPU
Expected behavior
The API should use the same compute device (here, the GPU) that the built-in chat uses.
Getting inspiration from the Python module, I simply added "device": "gpu" to the JSON-HTTP call performed by CURL and gpt4all is using the GPU!
Full example:
curl http://localhost:4891/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "Nous Hermes 2 Mistral DPO", "messages": [{"role": "user", "content": "help me"}], "temperature": 0.7, "device": "gpu" }'
So this ticket is more about documentation than a missing feature.
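For anyone who prefers Python over curl, here is a minimal sketch of the same request using only the standard library. It assumes the local API server is listening on port 4891 as in the example above. The "device" field is only what was observed in this comment; whether the server actually honors it is not confirmed by any documentation. The function names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the local API server listens here, as in the curl example above.
API_URL = "http://localhost:4891/v1/chat/completions"


def build_request(prompt, model="Nous Hermes 2 Mistral DPO", device="gpu"):
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        # "device" is the extra field reported in this thread; it is an
        # observation from one user, not a documented API parameter.
        "device": device,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request to the running server and return the parsed JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as response:
        return json.load(response)
```

With the server running, `ask("help me")` should return the parsed response body; watching GPU load while it runs is one way to check whether the field has any effect on a given setup.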
> Getting inspiration from the Python module, I simply added "device": "gpu" to the JSON-HTTP call performed by CURL and gpt4all is using the GPU!
You're using the docker-based gpt4all-api server? I doubt that actually works: https://github.com/nomic-ai/gpt4all/blob/97de30edd100071ef70372f38e85959cac6378a3/gpt4all-api/gpt4all_api/app/api_v1/routes/chat.py#L50-L53
And then it proceeds to call GPT4All without ever passing the device argument: https://github.com/nomic-ai/gpt4all/blob/97de30edd100071ef70372f38e85959cac6378a3/gpt4all-api/gpt4all_api/app/api_v1/routes/chat.py#L62
No, I'm using the Ubuntu installer. When testing the behaviour empirically, it works:
> No, I'm using the Ubuntu installer. When testing the behaviour empirically, it works:
The OP reports that GPT4All Chat's built-in local server uses the GPU when one is selected in settings. Their request is to also add GPU support to the standalone docker-based API server. Your findings seem to agree with theirs.
gpt4all-api has been removed, see #2314.