
Is it possible to run on a 4GB memory GPU?

Open · liketheflower opened this issue 2 years ago · 5 comments

I noticed that the README mentions that we need "6.4G GPU memory" to run the demo. However, my Mac Pro only has 4GB of GPU memory, so I am wondering whether there is any approach to run it on a PC with 4GB of GPU memory. Thanks!

liketheflower · Apr 27 '23 04:04

Same issue here. I have 4GB of dedicated GPU memory and 16GB shared, but initialization failed. Any workaround? Thanks!

robbynie · Apr 28 '23 03:04

It's coming soon. Our team is internally testing running Vicuna within 4GB of memory and will make it public soon.

jinhongyii · Apr 28 '23 03:04
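For context on why a 4GB target is within reach at all: below is a rough back-of-the-envelope estimate of weight memory for a 7B-parameter model such as Vicuna. The figures are illustrative assumptions only, not numbers from the web-llm or mlc-llm projects, and they count weights alone.

```python
# Rough weight-memory estimate for a 7B-parameter model.
# Illustrative assumption: weights only, ignoring the KV cache,
# activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 7e9  # e.g., Vicuna-7B

for bits in (16, 4, 3):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(N_PARAMS, bits):4.1f} GB")

# 16-bit weights: 13.0 GB  -> far over a 4 GB budget
#  4-bit weights:  3.3 GB  -> close to fitting
#  3-bit weights:  2.4 GB  -> leaves headroom for the KV cache
```

At 3-bit or 4-bit precision the weights alone fit under 4GB, which is roughly why aggressive quantization makes a 4GB GPU plausible while a less quantized demo needs more.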

> It's coming soon. Our team is internally testing running Vicuna within 4GB of memory and will make it public soon.

Thanks. I thought it would be hard, maybe even impossible. Looking forward to the 4GB version. It will be super meaningful.

liketheflower · Apr 28 '23 13:04

Try out our latest project, https://github.com/mlc-ai/mlc-llm. You can run a model within a 4GB memory constraint in the native runtime. We will support 4GB LLMs on the web later.

jinhongyii · Apr 29 '23 06:04
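For the curious, here is a generic sketch of the kind of 4-bit weight quantization and packing that makes such memory savings possible. This illustrates the general technique only, with made-up helper names; it is not mlc-llm's actual quantization scheme.

```python
import numpy as np

# Generic sketch of symmetric 4-bit quantization -- NOT mlc-llm's
# actual scheme, just an illustration of the ~4x saving vs fp16.

def quantize_4bit(weights: np.ndarray):
    """Map floats to integers in [-8, 7] with one per-tensor scale."""
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def pack_nibbles(q: np.ndarray) -> np.ndarray:
    """Pack two 4-bit values into each byte, halving storage vs int8."""
    u = (q + 8).astype(np.uint8)      # shift [-8, 7] -> [0, 15]
    return (u[0::2] << 4) | u[1::2]   # two weights per byte

w = np.random.randn(1024).astype(np.float32)
q, scale = quantize_4bit(w)
packed = pack_nibbles(q)

# fp16 storage would be 2 bytes/weight; packed 4-bit is 0.5 bytes/weight.
print(f"fp16: {w.size * 2} bytes -> packed 4-bit: {packed.nbytes} bytes")
# fp16: 2048 bytes -> packed 4-bit: 512 bytes
```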

> Try out our latest project, https://github.com/mlc-ai/mlc-llm. You can run a model within a 4GB memory constraint in the native runtime. We will support 4GB LLMs on the web later.

Thanks. Since it does not need a web browser and can run natively in the terminal, I tried it on a Linux server by following the instructions, and the model works. Thanks!

liketheflower · Apr 29 '23 22:04