
Is it possible to run on a 4GB memory GPU?

Open · liketheflower opened this issue 2 years ago · 5 comments

I noticed that the README mentions that we need "6.4G GPU memory" to run the demo. However, my Mac Pro only has 4GB of GPU memory, so I am wondering whether there is any approach to run it on a PC with 4GB of GPU memory. Thanks!

liketheflower · Apr 27 '23 04:04

Same issue here. I have 4GB of dedicated GPU memory and 16GB shared, but initialization failed. Any workaround? Thanks!

robbynie · Apr 28 '23 03:04

It's coming soon. Our team is internally testing running Vicuna within 4GB of memory and will make it public soon.

jinhongyii · Apr 28 '23 03:04
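For context on why a 4GB target is within reach at all: below is a rough back-of-the-envelope estimate of weight memory for a 7B-parameter model such as Vicuna. The figures are illustrative assumptions only, not numbers from the web-llm or mlc-llm projects, and they count weights alone.

```python
# Rough weight-memory estimate for a 7B-parameter model.
# Illustrative assumption: weights only, ignoring the KV cache,
# activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 7e9  # e.g., Vicuna-7B

for bits in (16, 4, 3):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(N_PARAMS, bits):4.1f} GB")

# 16-bit weights: 13.0 GB  -> far over a 4 GB budget
#  4-bit weights:  3.3 GB  -> close to fitting
#  3-bit weights:  2.4 GB  -> leaves headroom for the KV cache
```

At 3-bit or 4-bit precision the weights alone fit under 4GB, which is roughly why aggressive quantization makes a 4GB GPU plausible while a less quantized demo needs more.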

> It's coming soon. Our team is internally testing running Vicuna within 4GB of memory and will make it public soon.

Thanks. I thought it would be hard, maybe even impossible. Looking forward to the 4GB version. It will be super meaningful.

liketheflower · Apr 28 '23 13:04

Try out our latest project, https://github.com/mlc-ai/mlc-llm. You can run a model within a 4GB memory constraint in the native runtime. We will support 4GB LLMs on the web later.

jinhongyii · Apr 29 '23 06:04
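For the curious, here is a generic sketch of the kind of 4-bit weight quantization and packing that makes such memory savings possible. This illustrates the general technique only, with made-up helper names; it is not mlc-llm's actual quantization scheme.

```python
import numpy as np

# Generic sketch of symmetric 4-bit quantization -- NOT mlc-llm's
# actual scheme, just an illustration of the ~4x saving vs fp16.

def quantize_4bit(weights: np.ndarray):
    """Map floats to integers in [-8, 7] with one per-tensor scale."""
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def pack_nibbles(q: np.ndarray) -> np.ndarray:
    """Pack two 4-bit values into each byte, halving storage vs int8."""
    u = (q + 8).astype(np.uint8)      # shift [-8, 7] -> [0, 15]
    return (u[0::2] << 4) | u[1::2]   # two weights per byte

w = np.random.randn(1024).astype(np.float32)
q, scale = quantize_4bit(w)
packed = pack_nibbles(q)

# fp16 storage would be 2 bytes/weight; packed 4-bit is 0.5 bytes/weight.
print(f"fp16: {w.size * 2} bytes -> packed 4-bit: {packed.nbytes} bytes")
# fp16: 2048 bytes -> packed 4-bit: 512 bytes
```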

> Try out our latest project, https://github.com/mlc-ai/mlc-llm. You can run a model within a 4GB memory constraint in the native runtime. We will support 4GB LLMs on the web later.

Thanks. Since it does not need a web browser and can run natively in the terminal, I tried it on a Linux server by following the instructions, and the model works. Thanks!

liketheflower · Apr 29 '23 22:04