Inference on multiple 4090 cards
I only have 4 x 4090 cards, but I cannot run InternVL-V1-5: the program loads the whole model onto a single card, resulting in OOM. How can I run the model across multiple cards instead of only one? @czczup
You can see an example of multi-GPU inference at https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5.
# Otherwise, you need to set device_map='auto' to use multiple GPUs for inference.
import torch
from transformers import AutoModel

path = 'OpenGVLab/InternVL-Chat-V1-5'
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto').eval()
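If device_map='auto' still overflows one of the 24 GB 4090s, you can also cap how much memory accelerate is allowed to place on each GPU via the max_memory argument of from_pretrained. Below is a minimal sketch assuming four visible GPUs; the 20GiB cap and the text-only question are illustrative assumptions, not taken from the model card:

import torch
from transformers import AutoModel, AutoTokenizer

path = 'OpenGVLab/InternVL-Chat-V1-5'

# Cap each of the four 4090s below its physical 24 GiB so accelerate
# spreads layers across cards instead of overfilling the first one.
# The 20GiB figure is an assumption; tune it for your machine.
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto',
    max_memory={i: '20GiB' for i in range(4)}).eval()

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# A pure-text query; image queries additionally need pixel_values
# prepared as shown on the model card.
generation_config = dict(max_new_tokens=512, do_sample=False)
response = model.chat(tokenizer, None, 'Hello, who are you?', generation_config)
print(response)

Capping each card slightly below its physical 24 GiB leaves headroom for activations and the CUDA context during generation.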