LLaVA
Can LLaVA-13b-delta-v0 be split into 900 MB parts?
I have 4 × 3080 Ti GPUs. With the current bin file sizes (9 GB, 9 GB, and 6 GB), I cannot load the shards across multiple graphics cards. If the model were split into ~900 MB parts, it would open up more possibilities for running it on multiple GPUs. Alternatively, is there a way for me to split it myself?
For example, llama-13b-hf has been split into 41 shards of roughly 900 MB each.
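For reference, re-sharding an HF-format checkpoint into smaller pieces is possible with `save_pretrained(max_shard_size=...)`. The sketch below is a generic Transformers example, not LLaVA-specific: the delta weights would first need to be merged into full weights, the local paths are hypothetical, and LLaVA's custom model class may be required in place of `AutoModelForCausalLM`.

```python
# Minimal re-sharding sketch (hypothetical paths; assumes merged
# HF-format weights, not the raw delta checkpoint).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "./llava-13b",            # hypothetical local path to merged weights
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,   # stream shards in instead of an extra full copy
)

# max_shard_size controls the size of each output .bin file.
model.save_pretrained("./llava-13b-900mb", max_shard_size="900MB")
```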
Hi, thank you for your interest in our work.
This may already be supported even with the current weight splits. I tried launching a worker with the multi-GPU parameter set to 4, and it scatters the model weights across the four GPUs. Can you try it by following the instructions here: https://github.com/haotian-liu/LLaVA#launch-a-model-worker-multiple-gpus-when-gpu-vram--24gb
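To illustrate why the 9 GB shard size is not the blocker: when loading with Accelerate's `device_map`, each shard file is read into CPU RAM and its tensors are dispatched to GPUs one by one, so no single shard ever has to fit on one card. Below is a minimal sketch of that mechanism, assuming merged weights at a hypothetical local path; the linked worker instructions remain the supported route.

```python
# Minimal multi-GPU loading sketch via Accelerate's device_map
# (illustrative only; paths and memory caps are assumptions).
import torch
from transformers import AutoModelForCausalLM

# A 3080 Ti has 12 GB; cap each card below that to leave headroom
# for activations during inference.
max_memory = {i: "10GiB" for i in range(4)}

model = AutoModelForCausalLM.from_pretrained(
    "./llava-13b",            # hypothetical local path to merged weights
    torch_dtype=torch.float16,
    device_map="auto",        # place layers across the 4 GPUs automatically
    max_memory=max_memory,
)
```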
Okay, I will try it later. Thank you!
Hi, I am closing this issue due to inactivity. I hope your problem has been resolved. If you have further concerns, please feel free to re-open this issue or open a new one. Thanks!