VLMEvalKit
Does vlmeval support multi-card inference and batch size > 1?
Hi, @John-Ge ,
- For simplicity, VLMEvalKit does not support batch size > 1 inference for now.
- VLMEvalKit currently supports two types of multi-GPU inference: 1) DistributedDataParallel via torchrun, which runs N VLM instances on N GPUs; this requires your VLM to be small enough to run on a single GPU (see the sketch after this list). 2) Models configured by default to use multiple GPUs (like IDEFICS_80B_INSTRUCT); when you launch with python, they will automatically run on all available GPUs.
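For illustration, here is a minimal sketch of the data-parallel pattern behind option 1), launched with something like `torchrun --nproc-per-node=8 script.py`. This is not VLMEvalKit's actual code; `build_vlm`, `load_benchmark`, and `save_partial_results` are hypothetical placeholders:

```python
# Sketch of torchrun-style data-parallel inference: each of the N processes
# builds its own full VLM instance on its own GPU and evaluates every N-th
# sample, so N GPUs give roughly N x throughput.
import os
import torch

def main():
    rank = int(os.environ.get('RANK', 0))
    world_size = int(os.environ.get('WORLD_SIZE', 1))
    torch.cuda.set_device(rank % torch.cuda.device_count())

    model = build_vlm()          # hypothetical: one full VLM per GPU
    dataset = load_benchmark()   # hypothetical benchmark loader

    results = {}
    for i in range(rank, len(dataset), world_size):  # stride by world size
        item = dataset[i]
        results[i] = model.generate(item['image'], item['question'])

    save_partial_results(results, rank)  # hypothetical: merged after all ranks finish

if __name__ == '__main__':
    main()
```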
Thanks for your reply! I would like to know what the normal format for inference with batch size > 1 is. Should we deploy the model through something like vLLM or TGI? Do we need to wait for them to support LLaVA?
The authors of LLaVA have created a beta version of batch inference: https://github.com/haotian-liu/LLaVA/issues/754
Hi, @darkpromise98 , we will try to include this feature in VLMEvalKit soon.
That's great!
https://github.com/haotian-liu/LLaVA/issues/754#issuecomment-1907970439 This issue builds a fast inference method for LLaVA; would you add this functionality for every benchmark in this repo?
BTW, I find that sglang may not support LoRA + base model. I train LLaVA with LoRA. If possible, I hope you could support loading the base model, merging the LoRA weights, and deploying it for evaluation.
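For what it's worth, merging LoRA weights into the base model can be done offline with `peft` before handing the merged checkpoint to any serving or eval framework. A minimal sketch is below; the paths are placeholders, and it assumes the base checkpoint loads via `AutoModelForCausalLM` (LLaVA may need its own model class instead):

```python
# Sketch: merge LoRA adapter weights into the base model with peft, then
# save a standalone checkpoint that eval/serving frameworks can load.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained('path/to/llava-base')    # placeholder path
model = PeftModel.from_pretrained(base, 'path/to/lora-adapter')      # placeholder path
model = model.merge_and_unload()  # fold the LoRA deltas into the base weights

model.save_pretrained('path/to/merged-model')
AutoTokenizer.from_pretrained('path/to/llava-base').save_pretrained('path/to/merged-model')
```

Once merged, the checkpoint is an ordinary model directory, so frameworks without LoRA support can serve it directly.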
Hi, @John-Ge @darkpromise98 , I have reviewed the request. I'm sorry, but I may not implement this feature on my own, for the following reasons:
- Currently, only a few VLMs support the `batch_inference` interface, and adding it for LLaVA may lead to some major changes in the inference pipeline of VLMEvalKit (a rough sketch of such an interface is below).
- The inference of LLaVA is relatively fast: under `batch_size=1`, llava-v1.5-13b can run at 3~4 fps on a single A100. Thus I think `batch_inference` for LLaVA may not be a critical feature for VLMEvalKit.
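To make the discussion concrete, here is a hypothetical sketch of what a `batch_inference`-style method could look like for a decoder-only model via transformers. This is not VLMEvalKit's actual interface, and it is text-only for brevity; a VLM version would additionally batch the image tensors:

```python
# Hypothetical batch_inference sketch: pad a list of prompts into one batch
# and run a single generate call instead of one call per sample.
import torch

def batch_inference(model, tokenizer, prompts, max_new_tokens=128):
    tokenizer.padding_side = 'left'  # left-pad so generation continues each prompt
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    inputs = tokenizer(prompts, return_tensors='pt', padding=True).to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # strip the prompt tokens, keeping only the newly generated continuations
    gen = out[:, inputs['input_ids'].shape[1]:]
    return tokenizer.batch_decode(gen, skip_special_tokens=True)
```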
BTW, I'm willing to review and merge it into the VLMEvalKit main branch if someone is willing to create a PR (it might be relatively heavy) for it.