LLaVA-NeXT

LLaVA-Video accuracy test results do not match the paper

Open linxid opened this issue 1 year ago • 1 comments

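I tested the accuracy of the models below, and the results differ from those reported in the paper. Is there something wrong with how I ran the evaluation? The accuracy I measured is shown here: (image) These are my evaluation scripts: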

  • LLaVA-Video-7B-Qwen2, 32 frames (max_frames_num=32):
/opt/conda/envs/python3.10/bin/python -m accelerate.commands.launch --num_processes=8 \
    -m lmms_eval \
    --model llava_vid \
    --model_args pretrained=/tmp/pre-trained/lmms-lab/LLaVA-Video-7B-Qwen2,conv_template=qwen_1_5,max_frames_num=32,mm_spatial_pool_mode=average \
    --tasks videomme \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_vid \
    --output_path ./logs/ 
  • LLaVA-Video-7B-Qwen2-Video-Only, 32 frames (max_frames_num=32):
/opt/conda/envs/python3.10/bin/python -m accelerate.commands.launch --num_processes=8 \
    -m lmms_eval \
    --model llava_vid \
    --model_args pretrained=/tmp/pre-trained/lmms-lab/LLaVA-Video-7B-Qwen2-Video-Only,conv_template=qwen_1_5,max_frames_num=32,mm_spatial_pool_mode=average \
    --tasks videomme \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_vid \
    --output_path ./logs/ 
  • LLaVA-Video-7B-Qwen2-Video-Only, 64 frames (max_frames_num=64):
/opt/conda/envs/python3.10/bin/python -m accelerate.commands.launch --num_processes=8 \
    -m lmms_eval \
    --model llava_vid \
    --model_args pretrained=/tmp/pre-trained/lmms-lab/LLaVA-Video-7B-Qwen2-Video-Only,conv_template=qwen_1_5,max_frames_num=64,mm_spatial_pool_mode=average \
    --tasks videomme \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix llava_vid \
    --output_path ./logs/ 
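
The commands above differ only in the checkpoint name and in max_frames_num, so they can be generated from a single template, which makes it easier to confirm that every run uses the same settings. The following is a minimal dry-run sketch, not part of lmms_eval: `build_cmd`, `MODEL_ROOT`, and `PYTHON` are hypothetical names introduced here, and the per-run `--log_samples_suffix` is a deliberate change from the original commands (which all use `llava_vid`) so that logs from different runs can be told apart. It only prints the commands and does not launch anything.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one lmms_eval command per (model, frame count) pair.
# build_cmd, MODEL_ROOT and PYTHON are illustrative names, not lmms_eval features.
PYTHON="${PYTHON:-/opt/conda/envs/python3.10/bin/python}"
MODEL_ROOT="${MODEL_ROOT:-/tmp/pre-trained/lmms-lab}"

build_cmd() {
    local model="$1" frames="$2"
    # Same model_args as in the post; only the checkpoint and max_frames_num vary.
    local model_args="pretrained=${MODEL_ROOT}/${model},conv_template=qwen_1_5,max_frames_num=${frames},mm_spatial_pool_mode=average"
    printf '%s ' "$PYTHON" -m accelerate.commands.launch --num_processes=8 \
        -m lmms_eval \
        --model llava_vid \
        --model_args "$model_args" \
        --tasks videomme \
        --batch_size 1 \
        --log_samples \
        --log_samples_suffix "llava_vid_${model}_f${frames}" \
        --output_path ./logs/
    printf '\n'
}

# The three distinct configurations listed in the post.
build_cmd LLaVA-Video-7B-Qwen2 32
build_cmd LLaVA-Video-7B-Qwen2-Video-Only 32
build_cmd LLaVA-Video-7B-Qwen2-Video-Only 64
```

Each printed line can be inspected and then run by hand; comparing the printed lines side by side shows at a glance which parameters differ between runs.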

linxid, Dec 30 '24 14:12

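Your setup looks fine. The result for llava-video-7B reported in the paper is 63.3, which seems to be slightly lower than the result you reproduced. I can run it again soon.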

ZhangYuanhan-AI, Feb 24 '25 12:02