
Qwen1.5-MoE-A2.7B training issue

Open · qazzombie opened this issue 3 months ago · 1 comment

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

```bash
#!/bin/bash

export USE_MODELSCOPE_HUB=1

deepspeed --num_gpus 1 ../../src/train_bash.py \
    --deepspeed ../deepspeed/ds_z3_config.json \
    --stage sft \
    --do_train \
    --model_name_or_path Qwen/Qwen1.5-MoE-A2.7B-Chat \
    --dataset our_alpaca \
    --dataset_dir ../../data \
    --template default \
    --finetuning_type full \
    --output_dir ../../output/case1 \
    --overwrite_cache \
    --overwrite_output_dir \
    --cutoff_len 1024 \
    --preprocessing_num_workers 16 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --warmup_steps 20 \
    --save_steps 2000 \
    --eval_steps 100 \
    --evaluation_strategy steps \
    --learning_rate 5e-5 \
    --num_train_epochs 6.0 \
    --max_samples 3000 \
    --val_size 0.1 \
    --ddp_timeout 180000000 \
    --plot_loss \
    --fp16 True
```

Expected behavior

My transformers is already at version 4.39.3.

System Info

ValueError: The checkpoint you are trying to load has model type qwen2_moe but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
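For context, this failure does not require launching a training run at all: on a transformers release that predates qwen2_moe support, merely loading the model config fails the same way. A minimal check (a sketch, assuming network access to the Hugging Face Hub and the same model ID as above):

```bash
# Raises the same ValueError on transformers releases that predate
# qwen2_moe support (the architecture landed after 4.39.3); succeeds
# once a new enough build is installed.
python -c "from transformers import AutoConfig; AutoConfig.from_pretrained('Qwen/Qwen1.5-MoE-A2.7B-Chat')"
```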

Others

No response

qazzombie · Apr 12 '24 18:04

You need to use the preview (development) build of transformers. Please install it with: pip install git+https://github.com/huggingface/transformers
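A sketch of the full workaround, assuming pip and git are available (pinning to a specific commit is optional but makes the environment reproducible):

```bash
# Install the development build of transformers from the main branch.
pip install git+https://github.com/huggingface/transformers

# Verify the installed version and confirm the qwen2_moe architecture
# is now recognized before relaunching training.
python -c "import transformers; print(transformers.__version__)"
python -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('Qwen/Qwen1.5-MoE-A2.7B-Chat').model_type)"
```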

marko1616 · Apr 12 '24 18:04