GPT4All response time
Hello everyone,
I installed GPT4All with the gpt4all-installer-win64.exe.
As models, I chose ggml-vicuna-7b-1.1-q4_2.bin and ggml-vicuna-13b-1.1-q4_2.bin.
To test, I use the prompt "Write a poem about a large language model that runs on my laptop".
The 7B takes 12 seconds before the first letter of the answer appears; the 13B takes 26 seconds.
Are these times normal for such a system? How can I shorten response times?
System: Windows 10, Ryzen 3600 (6C/12T, 3.60-4.20 GHz), 32 GB RAM, Nvidia GTX 1070 8 GB
Best regards, Hans
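For anyone who wants to reproduce this measurement in code rather than by eye, here is a minimal sketch using the gpt4all Python binding, assuming its streaming generate() API; the model file name is the one from the report above:

```python
import time
from gpt4all import GPT4All

# Model file from the report above; adjust the name/path to your install.
model = GPT4All("ggml-vicuna-7b-1.1-q4_2.bin")

prompt = "Write a poem about a large language model that runs on my laptop"
start = time.perf_counter()

# streaming=True yields tokens as they are produced, so the first
# iteration corresponds to "the first letter of the answer appears".
for token in model.generate(prompt, max_tokens=200, streaming=True):
    print(f"Time to first token: {time.perf_counter() - start:.1f} s")
    break
```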
I can confirm that. I see similar response times (or maybe even worse), possibly since the last update.
System: Debian 12, Intel Core i7-6700HQ, GeForce 980M, 16 GB RAM
And no matter which model I use, I get these very bad response times.
Try the groovy .bin model.
On an i7-6700, 16 GB RAM, GTX 1060 6 GB, I'm seeing around 5 to 8 seconds before responses with the gpt4all Windows GUI distribution.
Hi g0dEngineer,
you're right, groovy answers much faster. Unfortunately, it does not answer in German. If I ask it a question in German, it answers in English; when I tell it to answer in German, it crashes. Vicuna handles this much better, and in terms of answer quality (in German and English) vicuna is clearly ahead of groovy. So groovy is no alternative for me. Nevertheless, thank you very much for your advice.
Best regards, Hans
Sorry you can't get groovy to answer in German. It looks like the trade-off of using vicuna will be more time to respond, due to its increased capabilities/parameters.
That seems normal imo. But try experimenting with the cpu threads option. For me, 4 threads is fastest and 5+ begins to slow down.
The times I mentioned were reported after I had set thread usage to 4 on the i7-6700 CPU in the C++ UI.
I did not think about setting the thread count for the Python implementation, though! Thanks
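For reference, a minimal sketch of what that would look like, assuming the Python binding exposes an n_threads argument on the GPT4All constructor (the GUI equivalent is the cpu threads option mentioned above):

```python
from gpt4all import GPT4All

# n_threads caps the CPU threads used for token generation; going past
# the number of physical cores often slows things down instead of helping.
model = GPT4All("ggml-vicuna-7b-1.1-q4_2.bin", n_threads=4)

print(model.generate("Write one sentence about laptops.", max_tokens=32))
```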
Try experimenting with the cpu threads option. For me 4 threads is fastest and 5+ begins to slow down.
Hi spacecowgoesmoo, thanks for the tip.
For me, 12 threads is the fastest. However, the difference is only in the low single-digit percentage range, which is a pity.
Maybe the Wizard Vicuna model will bring a noticeable performance boost.
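As a rough way to find the sweet spot on a given machine, a sweep like the following (same hedged assumptions about the Python binding as above) measures time to first token per thread count:

```python
import time
from gpt4all import GPT4All

prompt = "Write a poem about a large language model that runs on my laptop"

# Candidate thread counts for a 6C/12T CPU like the Ryzen 3600;
# the best value is machine-specific, so measure rather than guess.
for n in (4, 6, 8, 12):
    model = GPT4All("ggml-vicuna-7b-1.1-q4_2.bin", n_threads=n)
    start = time.perf_counter()
    next(model.generate(prompt, max_tokens=8, streaming=True))
    print(f"{n:2d} threads: first token after {time.perf_counter() - start:.1f} s")
```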
This seems answered!
Please always feel free to open more issues if you have anything else.