Training issues with dpo_ov7b.sh
1. `train_dpo.py` does `from data_processing.utils import load_jsonl, load_json`, but the `data_processing` package is missing from the repo.
2. The `modality_lengths` function computes the length of the `answer` field, but the DPO dataset construction produces no `answer` field.
3. Training runs out of memory on A800 80 GB GPUs — how can this be optimized?
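For issue 1, a minimal stand-in for the missing module unblocks the import. This is a sketch based only on the function names in the import statement (the original `data_processing/utils.py` is not in the repo, so the exact behavior is an assumption — these are the conventional JSON / JSON-Lines loaders):

```python
# data_processing/utils.py -- hypothetical stand-in for the missing module.
import json

def load_json(path):
    """Load a whole JSON file (e.g. a list of DPO samples)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def load_jsonl(path):
    """Load a JSON-Lines file: one JSON object per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Place this as `data_processing/utils.py` (with an empty `data_processing/__init__.py`) at the repo root so the import in `train_dpo.py` resolves.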
The command is as follows:

```shell
torchrun --nproc_per_node=8 \
    llava/train/train_dpo.py \
    --deepspeed scripts/zero3.json \
    --model_name_or_path=${SFT_MODEL} \
    --dpo_alpha=1.0 \
    --beta=${beta} \
    --gamma=0 \
    --version $PROMPT_VERSION \
    --data_path=$DATA_PATH \
    --image_folder /raid/zhanghang02/llava_ov/images \
    --video_folder /raid/zhanghang02/llava_ov/videos \
    --mm_tunable_parts="mm_vision_tower,mm_mlp_adapter,mm_language_model" \
    --unfreeze_mm_vision_tower True \
    --vision_tower ${VISION_MODEL_VERSION} \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --group_by_modality_length True \
    --image_aspect_ratio anyres_max_9 \
    --image_grid_pinpoints "(1x1),...,(6x6)" \
    --mm_patch_merge_type spatial_unpad \
    --bf16 True \
    --run_name $DPO_CLEAN_NAME \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs $EPOCH \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 1 \
    --learning_rate 5e-7 \
    --weight_decay 0. \
    --warmup_ratio 0.1 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 32768 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb \
    --dataloader_drop_last True
```
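For the OOM on A800 80 GB (issue 3): the command above already enables ZeRO-3 (`scripts/zero3.json`), bf16, gradient checkpointing, and a per-device batch size of 1, so the usual next step is offloading optimizer and parameter states to CPU. A sketch of what the DeepSpeed config could be changed to — the field names below are standard ZeRO-3 options, but verify them against your DeepSpeed version, and note that CPU offload trades speed for memory:

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": true },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```

Other levers visible in the command itself: lower `--model_max_length` (32768 is very long for DPO pairs; DPO keeps a frozen reference model in memory as well), or remove `mm_vision_tower` from `--mm_tunable_parts` to freeze the vision tower.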
I also hit the OOM — have you solved it yet?
How was the issue "train_dpo.py needs `from data_processing.utils import load_jsonl, load_json`, but the `data_processing` module is missing" resolved?
haha, I ran into exactly the same problems
Same questions here — any pointers would be appreciated.
CUDA out of memory +1
Try LLaMA-Factory instead; I tested multimodal DPO there without problems.
Zhang Zhihong @.***> wrote on Tue, Jul 1, 2025, 16:21:
zhang123434 left a comment (LLaVA-VL/LLaVA-NeXT#333) https://github.com/LLaVA-VL/LLaVA-NeXT/issues/333#issuecomment-3022559450
CUDA out of memory +1