mlc-llm
Multi-GPU support for larger-than-VRAM models
Awesome project, thanks!
Does it support sharding large models across multiple GPUs, or would this be in scope for this project in the future?
Thank you for the suggestion! Yes, we'd love to support the needs of the community and will add this to our roadmap.
llama.cpp seems to support this now, FYI!
Multi-GPU support has now landed in MLC.
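For anyone landing on this thread later, here is a minimal sketch of what using it from the Python API might look like, assuming the `MLCEngine` entry point and that `EngineConfig` exposes `tensor_parallel_shards` (the name MLC uses for its tensor-parallelism setting); the exact import paths, signatures, and the example model id may differ by version:

```python
# Hedged sketch: sharding a model across two GPUs with the mlc_llm
# Python API. Names below are assumptions based on the current API
# and may differ by version; the model id is only an example.
from mlc_llm import MLCEngine
from mlc_llm.serve.config import EngineConfig

# Shard weights and computation across 2 GPUs via tensor parallelism
# (assumes EngineConfig accepts tensor_parallel_shards).
engine = MLCEngine(
    model="HF://mlc-ai/Llama-2-70b-chat-hf-q4f16_1-MLC",
    engine_config=EngineConfig(tensor_parallel_shards=2),
)

# OpenAI-style chat completion against the sharded model.
response = engine.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

engine.terminate()
```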