
How can inference use multiple V100s instead of a single A100?

Open flying2023 opened this issue 1 year ago • 4 comments

Hi, I only have V100 machines, and loading the LLaMA-13B model OOMs on a single card. Since I have multiple V100s, how can I get automap-style behavior that shards the model across several V100 GPUs?

flying2023 avatar Sep 27 '23 16:09 flying2023

Just change the `device_map` argument of `LlamaForCausalLM.from_pretrained` to `"auto"`. The initialization code may need a few small tweaks.
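Conceptually, `device_map="auto"` makes `accelerate` compute a module-to-GPU assignment like the one the hypothetical helper below builds by hand (the helper is illustrative, not part of `transformers`; the module names follow the Hugging Face LLaMA layout):

```python
def make_device_map(num_layers, num_gpus):
    """Evenly assign LLaMA decoder layers to GPUs.

    Embeddings go on GPU 0; the final norm and lm_head go on the last
    GPU so the output logits land where the last layers already are.
    """
    device_map = {
        "model.embed_tokens": 0,
        "model.norm": num_gpus - 1,
        "lm_head": num_gpus - 1,
    }
    per_gpu = -(-num_layers // num_gpus)  # ceiling division
    for i in range(num_layers):
        device_map[f"model.layers.{i}"] = min(i // per_gpu, num_gpus - 1)
    return device_map

# LLaMA-13B has 40 decoder layers; spread them over 4 V100s:
dm = make_device_map(40, 4)
# This dict could be passed instead of "auto", e.g.:
# LlamaForCausalLM.from_pretrained(llama_model_path, device_map=dm)
print(dm["model.layers.0"], dm["model.layers.39"], dm["lm_head"])
```

In practice you would just pass `device_map="auto"` (optionally with `torch_dtype=torch.float16` to halve memory) and let `accelerate` balance the layers for you; the explicit dict is only useful when the automatic split needs adjusting.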

xmy0916 avatar Oct 17 '23 07:10 xmy0916

```python
if ckpt_path:
    print("Load first Checkpoint: {}".format(ckpt_path))
    ckpt = torch.load(ckpt_path, map_location="cpu")
    msg = model.load_state_dict(ckpt['model'], strict=False)
ckpt_path_2 = cfg.get("ckpt_2", "")
if ckpt_path_2:
    print("Load second Checkpoint: {}".format(ckpt_path_2))
    ckpt = torch.load(ckpt_path_2, map_location="cpu")
    msg = model.load_state_dict(ckpt['model'], strict=False)
```

Even after changing `device_map` to `"auto"` in `LlamaForCausalLM.from_pretrained`, doesn't the `load_state_dict` step above still load everything onto a single card and OOM?
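In principle it shouldn't: `map_location="cpu"` keeps the checkpoint tensors in host RAM, and `load_state_dict` copies values into each parameter in place, on whatever device `from_pretrained` already put it. A minimal sketch of that behavior with a toy module (not the Video-LLaMA model):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Toy stand-in for the real model; the names here are illustrative only.
model = nn.Linear(4, 4)

ckpt_file = os.path.join(tempfile.mkdtemp(), "ckpt.pth")
torch.save({"model": model.state_dict()}, ckpt_file)

# map_location="cpu" pins the loaded tensors to host memory...
ckpt = torch.load(ckpt_file, map_location="cpu")
# ...and load_state_dict copies the values into the existing parameters
# without moving those parameters off their current devices.
msg = model.load_state_dict(ckpt["model"], strict=False)

print(all(t.device.type == "cpu" for t in ckpt["model"].values()))  # True
```

Two caveats: the full checkpoint must still fit in CPU RAM, and if `device_map="auto"` left some modules as meta tensors (offloaded), a plain `load_state_dict` may not suffice; `accelerate`'s `load_checkpoint_and_dispatch` is the usual tool for that case.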

flying2023 avatar Oct 18 '23 03:10 flying2023

Has anyone gotten this working? It won't even run on a 24 GB GPU.

james-hu avatar Jan 15 '24 11:01 james-hu

I ended up using `low_resource: True`. With that, the 13B model runs on a single 24 GB GPU.
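For reference, in the MiniGPT-4-style configs that Video-LLaMA inherits, `low_resource` typically sits under the `model` section of the eval config and switches the LLaMA weights to 8-bit loading on a single GPU. A sketch of the fragment (file path and surrounding keys assumed):

```yaml
# eval_configs/video_llama_eval.yaml (path assumed)
model:
  # ...other model keys unchanged...
  low_resource: True   # load LLaMA in 8-bit on one GPU (MiniGPT-4 convention)
```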

james-hu avatar Jan 21 '24 04:01 james-hu