[Refactor] Migrate to Nexa AI llama.cpp build
Feature Request
The Nomic build of llama.cpp is outdated (e.g. #3523, #3537, #3540) and due to be replaced. This FR is an alternative to the proposed switch to Ollama (#3542), which I reckon would be about as welcome as an outhouse breeze.
Nexa AI maintain their own build of llama.cpp, used in their Python SDK, which supports a diverse range of hardware. It may be preferable to migrate to this third-party build rather than persisting with the Ollama PR. The larger Nexa SDK package is likely to keep receiving regular updates, and even if some adaptation work is required, integrating the Nexa llama.cpp build would still be easier than continuing Nomic's own fork. Resource constraints are no doubt a key consideration behind #3542!
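For a rough sense of scale, here is a minimal sketch of what local inference through the Nexa SDK's Python bindings might look like. The import path, class name, and model identifier format below are assumptions based on my reading of the nexa-sdk README, not a verified or tested integration:

```python
# Hypothetical sketch only: module/class names below are assumptions,
# not a confirmed nexa-sdk API surface.
from nexa.gguf import NexaTextInference  # assumed import path

# Load a local GGUF model through Nexa's llama.cpp build.
llm = NexaTextInference(
    model_path="Llama-3.2-1B-Instruct:q4_0",  # assumed identifier format
)

# Stream a completion, roughly as a chat backend would consume it.
for chunk in llm.create_completion("Why is the sky blue?", stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
```

If the bindings really are this close to the llama-cpp-python style, the adaptation work on the GPT4All side would mostly be plumbing rather than a rewrite, which is the point of the proposal.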
The most important compatibility issue at the Nexa end is the current lack of Vulkan builds for Linux (nexa-sdk#380). However, this is probably an oversight rather than a deliberate omission. There is also a separate branch for Qualcomm (arm64) NPUs here, although it may not yet be production-ready.
Note: I am not a representative of Nexa AI
cc @Davidqian123
I think it could be nice to use the Ollama sources to run LLMs, as Ollama is well maintained.
I also think it would be good if we could plug Ollama into GPT4All the way we can plug in other AI services.
It's not clear to me how stable Ollama really is, given its ongoing setup challenges and, more recently, its deprecation of llama.cpp. In any case, this FR concerns the default inference engine for GPT4All rather than adding secondary options. That said, it could be very helpful to repurpose the code in #3542 for scenarios where Ollama works as intended, even while moving towards something like Nexa AI's llama.cpp build for primary inference.
If there is a way to set up ChatGPT as an AI provider, it should also be possible to use Ollama, since Ollama exposes an OpenAI-compatible API.
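To illustrate: a minimal sketch of talking to a local Ollama instance through its documented OpenAI-compatible endpoint. This assumes Ollama is running on its default port (11434) and that a model such as `llama3.2` has already been pulled; a ChatGPT-style provider hook in GPT4All could in principle reuse the same client with a different base URL:

```python
from openai import OpenAI

# Point the standard OpenAI client at Ollama's OpenAI-compatible endpoint.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",  # assumes this model has been pulled locally
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
```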