
GPT4All response time

Open hansnolte opened this issue 2 years ago • 7 comments

Hello everyone,

I installed GPT4All with the gpt4all-installer-win64.exe.

As models, I downloaded ggml-vicuna-7b-1.1-q4_2.bin and ggml-vicuna-13b-1.1-q4_2.bin.

To test, I prompt: "Write a poem about a large language model that runs on my laptop".

The 7b takes 12 seconds before the first letter of the answer appears. The 13b takes 26 seconds before the first letter of the answer appears.

Are these times normal for such a system? How can I shorten response times?

System: Windows 10, Ryzen 3600 (6C/12T, 3.60-4.20 GHz), 32 GB RAM, Nvidia GTX 1070 8 GB

Best regards Hans

hansnolte avatar May 18 '23 14:05 hansnolte
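The latency Hans is timing by hand is time-to-first-token. It can be measured around any streaming generator with a few lines of stdlib Python; a minimal sketch, where `fake_stream` is a stand-in for a real model's token stream (the hypothetical sleep simulates prompt processing):

```python
import time

def time_to_first_token(token_stream):
    """Return (seconds until the first token arrives, full generated text)."""
    start = time.perf_counter()
    first = None
    parts = []
    for tok in token_stream:
        if first is None:
            first = time.perf_counter() - start  # latency to first token
        parts.append(tok)
    return first, "".join(parts)

def fake_stream():
    """Stand-in for a real model's streaming generator (hypothetical)."""
    time.sleep(0.05)  # simulated prompt-processing delay
    yield "Hello"
    yield ", world"

latency, text = time_to_first_token(fake_stream())
```

With a real backend, pass its streaming generator in place of `fake_stream()` and compare the measured latency across models.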

I can confirm that. I have similar response times (or maybe even worse), possibly since the last update.

Debian 12, Intel Core i7-6700HQ, GeForce 980M, 16 GB RAM

And no matter which model I use, I get these very bad response times.

KiBLS avatar May 18 '23 14:05 KiBLS

Try the groovy .bin model.

On an i7-6700, 16 GB RAM, GTX 1060 6 GB, I'm seeing around 5 to 8 seconds before responses with the gpt4all Windows GUI distribution.

iCosmosNeuroverse avatar May 18 '23 15:05 iCosmosNeuroverse

Hi g0dEngineer,

you're right, groovy answers much faster. But unfortunately it does not answer in German. If I ask it a question in German, it answers in English, and when I tell it to answer in German, it crashes. Vicuna handles this much better; in terms of answer quality (in German and English), vicuna is clearly ahead of groovy. So, no alternative for me. Nevertheless, thank you very much for your advice.

Many greetings Hans

hansnolte avatar May 19 '23 07:05 hansnolte


Sorry you can't get groovy to answer in German. Looks like the trade-off of using vicuna will be longer response times due to its increased capabilities/parameters.

iCosmosNeuroverse avatar May 24 '23 02:05 iCosmosNeuroverse

Try experimenting with the cpu threads option. For me 4 threads is fastest and 5+ begins to slow down.

spacecowgoesmoo avatar May 24 '23 15:05 spacecowgoesmoo

The times I mentioned were measured after I had set thread usage to 4 on the i7-6700 CPU for the C++ UI.

I did not think about setting the thread count for the Python implementation though! Thanks.

That seems normal imo. But try experimenting with the cpu threads option. For me 4 threads is fastest and 5+ begins to slow down.

iCosmosNeuroverse avatar May 24 '23 15:05 iCosmosNeuroverse
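The thread-count sweep the commenters are doing by hand can also be scripted. A minimal sketch of the idea: time one short generation per thread count and compare. The block self-tests with a dummy "model"; the gpt4all wiring in the trailing comment is hypothetical and version-dependent (the exact keyword for the thread count may differ — check your bindings' docs):

```python
import time

def sweep_threads(make_generate, prompt, thread_counts):
    """Time one short generation per thread count; returns {n_threads: seconds}."""
    results = {}
    for n in thread_counts:
        generate = make_generate(n)  # build a model/generator pinned to n threads
        start = time.perf_counter()
        generate(prompt)
        results[n] = time.perf_counter() - start
    return results

# Self-test with a dummy "model" whose latency grows with n:
timings = sweep_threads(lambda n: (lambda p: time.sleep(0.005 * n)), "hi", (1, 2))

# Hypothetical wiring for the gpt4all Python bindings (untested sketch; the
# n_threads keyword and generate() signature may differ between versions):
#
#   from gpt4all import GPT4All
#   def make_generate(n):
#       model = GPT4All("ggml-vicuna-7b-1.1-q4_2.bin", n_threads=n)
#       return lambda p: model.generate(p, max_tokens=32)
#
#   print(sweep_threads(make_generate, "Write a poem", (4, 6, 8, 12)))
```

Run the sweep once per machine; as the comments above suggest, the optimum depends on the CPU (4 threads on the i7-6700, 12 on the Ryzen 3600).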

Try experimenting with the cpu threads option. For me 4 threads is fastest and 5+ begins to slow down.

Hi spacecowgoesmoo, thanks for the tip.

For me, 12 threads is fastest. However, the difference is only in the low single-digit percentage range, which is a pity.

Maybe the Wizard Vicuna model will bring a noticeable performance boost.

hansnolte avatar May 25 '23 07:05 hansnolte

This seems answered!

Please always feel free to open more issues if you have anything else.

niansa avatar Aug 15 '23 11:08 niansa