InternVL
How can I do multi-image inference without using lmdeploy or swift?
I am using the code given in [https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5] and running InternVL-Chat-V1-5 on multiple V100s.
import torch
from transformers import AutoModel, AutoTokenizer

# load_image is the image-loading helper from the InternVL-Chat-V1-5 model card
# (build_transform + dynamic_preprocess tiling); it is omitted here for brevity.

path = "./InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto').eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

pixel_values = load_image('xxx.jpg', max_num=6).to(torch.bfloat16).cuda()
generation_config = dict(
    num_beams=1,
    max_new_tokens=512,
    do_sample=False,
)

# single-round single-image conversation
question = "describe this image"
response = model.chat(tokenizer, pixel_values, question, generation_config)
I would like to do multi-image inference similar to [https://lmdeploy.readthedocs.io/zh-cn/latest/inference/vl_pipeline.html#id5], but lmdeploy does not support dp. How should I do multi-image inference?
the same question
What does dp mean?
data parallel
You can now do multi-image inference with transformers by following this document:
https://internvl.readthedocs.io/en/latest/internvl2.0/quick_start.html#inference-with-transformers
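For reference, a minimal sketch of the multi-image call described in that quick-start, assuming a checkpoint whose chat() accepts a num_patches_list argument (as in the linked InternVL2.0 guide); the image paths are placeholders and load_image is the helper from the model card:

import torch
from transformers import AutoModel, AutoTokenizer

path = "./InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto').eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
generation_config = dict(num_beams=1, max_new_tokens=512, do_sample=False)

# Load each image separately, then concatenate the tiles along dim 0.
# 'image1.jpg' / 'image2.jpg' are placeholder paths; load_image comes from the model card.
pixel_values1 = load_image('image1.jpg', max_num=6).to(torch.bfloat16).cuda()
pixel_values2 = load_image('image2.jpg', max_num=6).to(torch.bfloat16).cuda()
pixel_values = torch.cat((pixel_values1, pixel_values2), dim=0)
# Tell chat() how many tiles belong to each image.
num_patches_list = [pixel_values1.size(0), pixel_values2.size(0)]

# One <image> placeholder per image in the prompt.
question = 'Image-1: <image>\nImage-2: <image>\nDescribe the two images in detail.'
response, history = model.chat(tokenizer, pixel_values, question, generation_config,
                               num_patches_list=num_patches_list,
                               history=None, return_history=True)
print(response)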