Training issues with dpo_ov7b.sh
1. `train_dpo.py` does `from data_processing.utils import load_jsonl, load_json`, but the `data_processing` package is missing from the repo.
2. The `modality_lengths` function computes the length of the `answer` field, but the DPO dataset construction produces no `answer` field.
3. Training runs out of memory on A800 80 GB GPUs — how can this be optimized?
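For issue 1, a minimal stand-in for the missing module unblocks the import. This is a sketch based only on the function names in the import statement (the original `data_processing/utils.py` is not in the repo, so the exact behavior is an assumption — these are the conventional JSON / JSON-Lines loaders):

```python
# data_processing/utils.py -- hypothetical stand-in for the missing module.
import json

def load_json(path):
    """Load a whole JSON file (e.g. a list of DPO samples)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def load_jsonl(path):
    """Load a JSON-Lines file: one JSON object per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Place this as `data_processing/utils.py` (with an empty `data_processing/__init__.py`) at the repo root so the import in `train_dpo.py` resolves.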
The command is as follows:

```shell
torchrun --nproc_per_node=8 \
    llava/train/train_dpo.py \
    --deepspeed scripts/zero3.json \
    --model_name_or_path=${SFT_MODEL} \
    --dpo_alpha=1.0 \
    --beta=${beta} \
    --gamma=0 \
    --version $PROMPT_VERSION \
    --data_path=$DATA_PATH \
    --image_folder /raid/zhanghang02/llava_ov/images \
    --video_folder /raid/zhanghang02/llava_ov/videos \
    --mm_tunable_parts="mm_vision_tower,mm_mlp_adapter,mm_language_model" \
    --unfreeze_mm_vision_tower True \
    --vision_tower ${VISION_MODEL_VERSION} \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --group_by_modality_length True \
    --image_aspect_ratio anyres_max_9 \
    --image_grid_pinpoints "(1x1),...,(6x6)" \
    --mm_patch_merge_type spatial_unpad \
    --bf16 True \
    --run_name $DPO_CLEAN_NAME \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs $EPOCH \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 1 \
    --learning_rate 5e-7 \
    --weight_decay 0. \
    --warmup_ratio 0.1 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 32768 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb \
    --dataloader_drop_last True
```
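For the OOM on A800 80 GB (issue 3): the command above already enables ZeRO-3 (`scripts/zero3.json`), bf16, gradient checkpointing, and a per-device batch size of 1, so the usual next step is offloading optimizer and parameter states to CPU. A sketch of what the DeepSpeed config could be changed to — the field names below are standard ZeRO-3 options, but verify them against your DeepSpeed version, and note that CPU offload trades speed for memory:

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": true },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto"
}
```

Other levers visible in the command itself: lower `--model_max_length` (32768 is very long for DPO pairs; DPO keeps a frozen reference model in memory as well), or remove `mm_vision_tower` from `--mm_tunable_parts` to freeze the vision tower.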
I also hit the OOM — have you solved it yet?
How was the issue "train_dpo.py needs `from data_processing.utils import load_jsonl, load_json`, but the `data_processing` module is missing" resolved?
haha, I ran into exactly the same problems
Same questions here — any pointers would be appreciated.
CUDA out of memory +1
Try LLaMA-Factory instead; I tested multimodal DPO there without problems.
Zhang Zhihong @.***> wrote on Tue, Jul 1, 2025, 16:21:
zhang123434 left a comment (LLaVA-VL/LLaVA-NeXT#333) https://github.com/LLaVA-VL/LLaVA-NeXT/issues/333#issuecomment-3022559450
CUDA out of memory +1