
Can LLaVA-13b-delta-v0 be split into 900M parts?

Open AreChen opened this issue 2 years ago • 2 comments

I have 4 x 3080 Ti. Given the current bin file sizes (9 GB, 9 GB, 6 GB), it is impossible to load them across multiple graphics cards. If the model were segmented into 900 MB parts, it would open up more possibilities for running on multiple graphics cards! Or is there a way for me to split it myself?

For example, llama-13b-hf has been segmented into 41 parts of roughly 900 MB each.
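(For what it's worth, I imagine the re-split could be done with the `max_shard_size` argument of transformers' `save_pretrained()`. A minimal sketch below, with a tiny GPT-2 standing in for the 13B model so it runs quickly; the output directory is a placeholder, and for the real thing you would load the merged LLaVA weights instead:)

```python
# Sketch only, not official LLaVA tooling: save_pretrained() accepts
# max_shard_size, so a checkpoint can be re-saved in smaller .bin shards.
import os
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny stand-in model; replace with the merged LLaVA-13b weights in practice.
model = GPT2LMHeadModel(GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=1000))
out_dir = "resharded"  # placeholder output directory

# max_shard_size caps each weight file; "900MB" would mirror llama-13b-hf.
# "200KB" keeps this toy example fast. safe_serialization=False keeps the
# classic pytorch_model-*.bin naming seen in the current checkpoint.
model.save_pretrained(out_dir, max_shard_size="200KB", safe_serialization=False)

print(sorted(f for f in os.listdir(out_dir) if f.endswith(".bin")))
```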

AreChen avatar Apr 23 '23 10:04 AreChen

Hi, thank you for your interest in our work.

This may already be supported even with the current weight splits. I tried launching a worker with the multiple-GPU parameter set to 4, and it seems to scatter the model weights over four GPUs. Can you try it following the instructions here: https://github.com/haotian-liu/LLaVA#launch-a-model-worker-multiple-gpus-when-gpu-vram--24gb
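For reference, the launch roughly looks like the sketch below (model path and ports are placeholders; follow the README section linked above for the exact command for your setup):

```shell
# Hypothetical example: scatter the weights over 4 GPUs via --num-gpus.
# Controller/worker ports and the model path below are placeholders.
python -m llava.serve.model_worker \
    --host 0.0.0.0 \
    --controller http://localhost:10000 \
    --port 40000 \
    --worker http://localhost:40000 \
    --model-path /path/to/LLaVA-13b-delta-v0 \
    --num-gpus 4
```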

haotian-liu avatar Apr 24 '23 07:04 haotian-liu

Okay, I will try it later. Thank you!

AreChen avatar Apr 24 '23 07:04 AreChen

Hi, I am closing this issue due to inactivity. I hope your problem has been resolved. If you have further concerns, please feel free to re-open this issue or open a new one. Thanks!

haotian-liu avatar May 01 '23 04:05 haotian-liu