gpt4all
Ability to run the prompt through multiple models
Feature request
It would be great if I could run the same prompt through multiple models. I understand this would be quite slow on consumer hardware as the models would need to be loaded/unloaded, but that's okay.
Motivation
So I can compare the output across many models to see which one is best fit for a particular task. Right now, it's a manual effort to change the models and paste the same prompt into all of them.
Your contribution
N/A
If I understand correctly it would be something like side-by-side Chatbot Arena (https://chat.lmsys.org/?arena).
Trouble is, running two models simultaneously would probably require at least 32 GB of RAM. It would be nice to have, nevertheless.
Indeed, so it would not be simultaneous but consecutive - load one model, run the prompt, unload, load another, repeat. It would be understandably slow, but still quicker than doing the same thing manually.
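The consecutive flow described above could be sketched roughly like this. This is not gpt4all's actual implementation, just an illustration of the load → generate → unload loop; the `loaders` mapping and the callables it holds are hypothetical stand-ins for whatever the app uses to construct a model.

```python
from typing import Callable, Dict, List, Tuple

# A loader is a zero-argument callable that loads a model and returns
# a generate function (prompt -> completion).
Loader = Callable[[], Callable[[str], str]]

def run_prompt_sequentially(
    prompt: str,
    loaders: Dict[str, Loader],
) -> List[Tuple[str, str]]:
    """Run one prompt through several models, one model at a time.

    Each model is loaded, queried, and then released before the next
    one is loaded, so peak memory stays at roughly one model's worth.
    """
    results: List[Tuple[str, str]] = []
    for name, load in loaders.items():
        generate = load()                      # load this model into memory
        results.append((name, generate(prompt)))
        del generate                           # drop the reference so the model can be freed
    return results
```

With the real Python bindings, a loader might look something like `lambda: GPT4All("some-model.gguf").generate` (assuming the `GPT4All` class from the `gpt4all` package), but the exact hook into the chat UI is left open here.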
This would be very useful, as many (if not most) gpt4all users are continually trying out multiple models to find which works best for their application or questions. A consecutive approach would work well: if the end user's computer is powerful and has plenty of memory, the process will move quickly. For users with less powerful computers, sequential responses will take more time, but still far less than switching models back and forth manually (and losing chat history each time).
Loading both at the same time would require a LOT of memory. Most people could load at most two 7B models on the same machine, and only with q4 quantization.
Please note that I'm not asking for models to be loaded at the same time - sequential loading is fine, with the speed tradeoff that entails. This is mentioned in both comments https://github.com/nomic-ai/gpt4all/issues/759#issue-1731555273 and https://github.com/nomic-ai/gpt4all/issues/759#issuecomment-1571972816.