
dpo_ov7b.sh training issues

Open zhanghang-official opened this issue 1 year ago • 6 comments

1. train_dpo.py requires from data_processing.utils import load_jsonl, load_json, but the data_processing module is missing from the repository.
2. The modality_lengths function computes the length of the answer field, but the DPO dataset as constructed has no answer field.
3. Training runs out of GPU memory on A800 80G cards. How can this be optimized?

The command is as follows:

torchrun --nproc_per_node=8 \
    llava/train/train_dpo.py \
    --deepspeed scripts/zero3.json \
    --model_name_or_path=${SFT_MODEL} \
    --dpo_alpha=1.0 \
    --beta=${beta} \
    --gamma=0 \
    --version $PROMPT_VERSION \
    --data_path=$DATA_PATH \
    --image_folder /raid/zhanghang02/llava_ov/images \
    --video_folder /raid/zhanghang02/llava_ov/videos \
    --mm_tunable_parts="mm_vision_tower,mm_mlp_adapter,mm_language_model" \
    --unfreeze_mm_vision_tower True \
    --vision_tower ${VISION_MODEL_VERSION} \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --group_by_modality_length True \
    --image_aspect_ratio anyres_max_9 \
    --image_grid_pinpoints "(1x1),...,(6x6)" \
    --mm_patch_merge_type spatial_unpad \
    --bf16 True \
    --run_name $DPO_CLEAN_NAME \
    --output_dir $OUTPUT_DIR \
    --num_train_epochs $EPOCH \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 1 \
    --learning_rate 5e-7 \
    --weight_decay 0. \
    --warmup_ratio 0.1 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 32768 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb \
    --dataloader_drop_last True
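
On the second point (modality_lengths expecting an answer field), one possible workaround is to fall back to the preference fields when answer is absent. The sketch below is not the repository's code: the field names prompt, chosen, and rejected are assumptions about how the DPO dataset is laid out, and the sign convention (positive for samples with an image or video, negative for text-only samples) is assumed to match what the length-grouped sampler expects.

```python
from typing import Any, Dict, List


def modality_lengths(samples: List[Dict[str, Any]]) -> List[int]:
    """Approximate per-sample lengths for modality-grouped sampling.

    Assumed (hypothetical) fields: "prompt", plus either "answer" or the
    DPO pair "chosen"/"rejected". Lengths are whitespace word counts.
    """
    lengths = []
    for sample in samples:
        prompt_len = len(str(sample.get("prompt", "")).split())
        if "answer" in sample:
            response_len = len(str(sample["answer"]).split())
        else:
            # No "answer" field: use the longer of the two preference responses.
            response_len = max(
                len(str(sample.get("chosen", "")).split()),
                len(str(sample.get("rejected", "")).split()),
            )
        cur_len = prompt_len + response_len
        # Assumed convention: positive = multimodal sample, negative = text-only.
        is_multimodal = "image" in sample or "video" in sample
        lengths.append(cur_len if is_multimodal else -cur_len)
    return lengths
```

Taking the longer of chosen and rejected is a design choice made here so that the grouping reflects the worst-case sequence length of each pair.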

Nov 05 '24 08:11 zhanghang-official

I ran into the OOM as well. Have you solved it yet?
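
As a rough way to see where the 80 GB goes, the model-state memory under ZeRO-3 can be estimated with back-of-the-envelope arithmetic. The sketch below assumes about 8e9 parameters (a guess based on the "ov7b" name, not a figure from this thread), mixed-precision Adam at 16 bytes per parameter, and a frozen bf16 reference model for DPO. Whether the reference copy is sharded depends on the training code, so both cases are shown.

```python
GIB = 2**30


def model_state_gib(num_params: float, num_gpus: int, bytes_per_param: int = 16) -> float:
    """Per-GPU memory (GiB) for model states fully sharded by ZeRO-3.

    16 bytes/param = bf16 weights (2) + bf16 gradients (2)
    + fp32 master weights (4) + Adam momentum (4) + Adam variance (4).
    """
    return num_params * bytes_per_param / num_gpus / GIB


def frozen_copy_gib(num_params: float, num_gpus: int, sharded: bool) -> float:
    """Per-GPU memory (GiB) for a frozen bf16 model (weights only, 2 bytes/param)."""
    shards = num_gpus if sharded else 1
    return num_params * 2 / shards / GIB


if __name__ == "__main__":
    NUM_PARAMS = 8e9  # assumption: rough size of a "7B" multimodal model
    NUM_GPUS = 8      # from --nproc_per_node=8 in the command above

    policy = model_state_gib(NUM_PARAMS, NUM_GPUS)
    ref_replicated = frozen_copy_gib(NUM_PARAMS, NUM_GPUS, sharded=False)
    ref_sharded = frozen_copy_gib(NUM_PARAMS, NUM_GPUS, sharded=True)
    print(f"policy model states per GPU:  {policy:.1f} GiB")
    print(f"reference model (replicated): {ref_replicated:.1f} GiB")
    print(f"reference model (sharded):    {ref_sharded:.1f} GiB")
```

Under these assumptions the model states account for well under half of an 80 GB card, which suggests the remainder is activations and logits. Those grow with sequence length, so --model_max_length 32768 and --image_aspect_ratio anyres_max_9 are plausible places to look first, but this is an estimate rather than a profile of the actual run.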

Nov 27 '24 08:11 NOBOSO

How did you solve the problem that "train_dpo.py requires from data_processing.utils import load_jsonl, load_json, but the data_processing module is missing"?
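
Nobody in the thread posted the missing module, so its real contents are unknown. Going only by the function names, a minimal stand-in might look like the sketch below. It assumes load_json reads a whole JSON file and load_jsonl reads one JSON object per line; the original helpers may have behaved differently.

```python
import json
from typing import Any, List


def load_json(path: str) -> Any:
    """Read a single JSON document from a file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def load_jsonl(path: str) -> List[Any]:
    """Read a JSON Lines file: one JSON value per non-empty line."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Saved as data_processing/utils.py (with an empty data_processing/__init__.py) somewhere on PYTHONPATH, this should at least let the import resolve.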

Dec 30 '24 10:12 Liuziyu77

Haha, I ran into exactly the same problems.

Dec 30 '24 12:12 Liuziyu77

Same question here. Any guidance would be appreciated.

Mar 06 '25 09:03 LWQuestc

CUDA out of memory +1
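
One commonly tried mitigation for this kind of OOM is ZeRO-3 with CPU offload in place of plain ZeRO-3. The sketch below writes such a DeepSpeed config from Python. The file name zero3_offload_sketch.json is hypothetical, the "auto" values rely on the Hugging Face Trainer integration filling them in, and nothing in this thread confirms that it fixes the issue. Offload trades GPU memory for host RAM and slower steps.

```python
import json


def build_zero3_offload_config() -> dict:
    """Build a ZeRO stage-3 DeepSpeed config with CPU offload (a sketch)."""
    return {
        "bf16": {"enabled": "auto"},
        "train_micro_batch_size_per_gpu": "auto",
        "train_batch_size": "auto",
        "gradient_accumulation_steps": "auto",
        "zero_optimization": {
            "stage": 3,
            # Keep optimizer states and parameters in pinned host memory.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            "offload_param": {"device": "cpu", "pin_memory": True},
            "overlap_comm": True,
            "contiguous_gradients": True,
            "stage3_gather_16bit_weights_on_model_save": True,
        },
    }


if __name__ == "__main__":
    # Hypothetical output path; pass it to the training script via --deepspeed.
    with open("zero3_offload_sketch.json", "w", encoding="utf-8") as f:
        json.dump(build_zero3_offload_config(), f, indent=2)
```

The generated file would replace scripts/zero3.json in the --deepspeed argument of the command in the original post.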

Jul 01 '25 08:07 zhang123434

Use LLaMA-Factory instead. I tested multimodal DPO with it and had no problems.

Jul 04 '25 02:07 zhanghang-official