MiniGPT-4
Loading the model on multiple GPUs
I have two RTX 4090 24GB cards. If possible, please add an extra argument to demo.py to load the model either on the CPU or on two or more GPUs, and another argument to run in 16-bit and take advantage of the extra GPU RAM, instead of having to edit config files.
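Something like the hypothetical flags below would cover it. None of these arguments exist in demo.py today; the names are only a suggestion:

# Hypothetical additions to demo.py's argument parser -- not part of the repo.
parser.add_argument('--device', default='cuda',
                    help="'cpu', a CUDA device like 'cuda:0', or 'auto' to shard across all GPUs")
parser.add_argument('--bits', type=int, default=16, choices=[8, 16],
                    help='load the LLaMA weights in 8-bit or 16-bit precision')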
I would also like to know how to do this. I have 2x RTX 3060 12GB, so I could load the 13B model, but multi-GPU loading doesn't seem to be implemented.
I have the same request.
I have the same request too.
- Set the parameter device_map='auto' when loading with LlamaForCausalLM.from_pretrained().
- Replace the line in demo.py with: chat = Chat(model, vis_processor, device='cuda')

It runs on two RTX 2080 Ti cards on my machine.
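For reference, a minimal standalone sketch of that loading call, assuming a placeholder checkpoint path (in MiniGPT-4 the actual call lives in minigpt4/models/mini_gpt4.py):

import torch
from transformers import LlamaForCausalLM

# device_map='auto' lets accelerate shard the layers across every visible GPU.
llama_model = LlamaForCausalLM.from_pretrained(
    'path/to/llama-checkpoint',   # placeholder path
    torch_dtype=torch.float16,    # halve the memory footprint of the weights
    device_map='auto',            # split the model across available GPUs
)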
It seems the model is split across two devices, but during inference the tensors flow between the two devices and it throws a device-mismatch error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
(1) Load LLaMA with device_map set to 'auto':
https://github.com/Vision-CAIR/MiniGPT-4/blob/22d8888ca2cf0aac862f537e7d22ef5830036808/minigpt4/models/mini_gpt4.py#L94
device_map = 'auto'
(2) In the line below, change 'cuda:{}'.format(args.gpu_id) to 'cuda'. Inputs will then be assigned automatically to device 0 or device 1 if you have two devices:
https://github.com/Vision-CAIR/MiniGPT-4/blob/22d8888ca2cf0aac862f537e7d22ef5830036808/demo.py#L64
chat = Chat(model, vis_processor, device='cuda')
(3) The "to device" can be removed from the line below because llama has been loaded to GPUs automatically:
https://github.com/Vision-CAIR/MiniGPT-4/blob/22d8888ca2cf0aac862f537e7d22ef5830036808/demo.py#L60
model = model_cls.from_config(model_config)
(4) When encoding the image, we can run the encoder on the CPU and then move the image embedding to the GPU:
https://github.com/Vision-CAIR/MiniGPT-4/blob/22d8888ca2cf0aac862f537e7d22ef5830036808/minigpt4/conversation/conversation.py#L185
https://github.com/Vision-CAIR/MiniGPT-4/blob/22d8888ca2cf0aac862f537e7d22ef5830036808/minigpt4/conversation/conversation.py#L186
image_emb, _ = self.model.encode_img(image.to('cpu'))
img_list.append(image_emb.to('cuda'))
The model should now work if you have multiple GPUs, each with limited memory. A combined sketch of all four edits follows.
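Putting the four edits together, here is a sketch of the changed lines (the surrounding code at that commit may differ slightly; treat this as a guide rather than an exact patch):

# (1) minigpt4/models/mini_gpt4.py -- shard LLaMA across all visible GPUs
self.llama_model = LlamaForCausalLM.from_pretrained(
    llama_model,
    torch_dtype=torch.float16,
    device_map='auto',
)

# (2) + (3) demo.py -- no explicit .to(device); pass a bare 'cuda' to Chat
model = model_cls.from_config(model_config)
chat = Chat(model, vis_processor, device='cuda')

# (4) minigpt4/conversation/conversation.py -- encode the image on the CPU,
# then move the embedding to the GPU where generation happens
image_emb, _ = self.model.encode_img(image.to('cpu'))
img_list.append(image_emb.to('cuda'))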
I did all of these steps, but I still get a truncated CUDA traceback:
Traceback (most recent call last):
File "/home2/jainit/MiniGPT-4/demo.py", line 61, in ...
... TORCH_USE_CUDA_DSA to enable device-side assertions.
@JainitBITW Is it working now for you?
Yes, I just restarted my CUDA.
@JainitBITW Did you do anything apart from @thcheung's instructions? Thanks anyway!
Nope, exactly the same.
What error are you getting?
I'm trying to run the 13B model on multiple GPUs. The authors have written that they currently don't support multi-GPU inference, so I want to be sure that inference on multiple GPUs is possible before provisioning the EC2 instance.
I think you can go ahead.
@JainitBITW @thcheung thanks, it worked for me (8-bit). Any idea how to do it in 16-bit (low_resource = False)? It throws this error:
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
I got past this error by setting vit_precision: "fp32" in minigpt_v2.yaml, but I haven't figured out what would be needed to cast the input to fp16 (half precision) as well, instead of making everything fp32.
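If anyone wants to try keeping fp16 instead, one untested idea is to cast the incoming image to the visual encoder's own dtype inside encode_img. A sketch only; the attribute names follow the MiniGPT-4 code, but the exact insertion point may differ:

# Match the input dtype to the ViT weights so a half-precision conv bias
# (c10::Half) no longer collides with an fp32 input tensor.
vit_dtype = next(self.visual_encoder.parameters()).dtype  # fp16 when vit_precision is "fp16"
image = image.to(vit_dtype)   # cast the input instead of forcing everything to fp32
image_embeds = self.ln_vision(self.visual_encoder(image))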
My solution is to restrict the process to a single GPU (CUDA_VISIBLE_DEVICES=1 hides every card except the second one, so --gpu-id 0 refers to that card):
CUDA_VISIBLE_DEVICES=1 python demo_v2.py --cfg-path eval_configs/minigptv2_eval.yaml --gpu-id 0