
Confused about the llama-pro demo: why should `num_layers` 49 be divisible by `num_layer_trainable` 2?

Open • hzgdeerHo opened this issue 9 months ago • 0 comments

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

Using the LLaMA Pro example script to finetune the 01-ai/Yi-1.5-9B-Chat model:

Modified `expand.sh`:

```bash
python scripts/llama_pro.py \
    --model_name_or_path 01-ai/Yi-1.5-9B-Chat \
    --output_dir models/01-ai/Yi-1.5-9B-Chat \
    --num_expand 2
```
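For context, `scripts/llama_pro.py` implements the block-expansion scheme from the LLaMA Pro paper: the decoder stack is split into `num_expand` groups, and a copy of each group's last block is inserted with its output projections zeroed, so the expanded model initially computes the same function as the original. Below is a minimal sketch of that idea, assuming a Llama-style block layout (`self_attn.o_proj`, `mlp.down_proj`); it is an illustration, not the actual script:

```python
import copy

def expand_blocks(layers: list, num_expand: int) -> list:
    """Interleave num_expand identity-initialized block copies into the stack
    (sketch of LLaMA Pro block expansion; not the real scripts/llama_pro.py)."""
    assert len(layers) % num_expand == 0, "depth must split into equal groups"
    group = len(layers) // num_expand
    expanded = []
    for i in range(num_expand):
        expanded.extend(layers[i * group : (i + 1) * group])
        new_block = copy.deepcopy(expanded[-1])
        # zero the output projections so the new block is a no-op at init
        new_block.self_attn.o_proj.weight.data.zero_()
        new_block.mlp.down_proj.weight.data.zero_()
        expanded.append(new_block)
    return expanded
```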

Modified `examples/extras/llama_pro/llama3_freeze_sft.yaml`:

```yaml
### model
model_name_or_path: models/01-ai/Yi-1.5-9B-Chat

### method
stage: sft
do_train: true
finetuning_type: freeze
freeze_trainable_layers: 2
freeze_trainable_modules: all
use_llama_pro: true

### dataset
dataset: identity
template: yi
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/Yi-1.5-9B-Chat/freeze/sft
logging_steps: 1
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 0.00005
num_train_epochs: 2
lr_scheduler_type: cosine
warmup_steps: 0.1
fp16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 1
evaluation_strategy: steps
eval_steps: 500
```
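With `use_llama_pro: true`, freeze tuning places the `freeze_trainable_layers` trainable layers at an even stride through the stack, which only works when the total depth divides evenly. Quick arithmetic with the layer count from the traceback below (plain Python; the candidate values are illustrative):

```python
num_layers = 49  # depth of the expanded checkpoint, per the error message
for k in (1, 2, 7, 49):  # candidate freeze_trainable_layers values
    status = "ok" if num_layers % k == 0 else f"fails ({num_layers} % {k} == {num_layers % k})"
    print(f"freeze_trainable_layers={k}: {status}")
# 49 = 7 * 7, so only 1, 7, and 49 divide it evenly; 2 does not.
```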

I got the following error message:

```
Traceback (most recent call last):
  File "/home/ubuntu/python3.9/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/home/ubuntu/LLaMA-Factory/src/llamafactory/cli.py", line 65, in main
    run_exp()
  File "/home/ubuntu/LLaMA-Factory/src/llamafactory/train/tuner.py", line 34, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/home/ubuntu/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 34, in run_sft
    model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
  File "/home/ubuntu/LLaMA-Factory/src/llamafactory/model/loader.py", line 144, in load_model
    model = init_adapter(config, model, model_args, finetuning_args, is_trainable)
  File "/home/ubuntu/LLaMA-Factory/src/llamafactory/model/adapter.py", line 73, in init_adapter
    raise ValueError(
ValueError: num_layers 49 should be divisible by num_layer_trainable 2.
```
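The raise comes from the LLaMA Pro branch of `init_adapter` in `src/llamafactory/model/adapter.py`. A paraphrased sketch of that check (approximate, not the verbatim source):

```python
def select_trainable_layers(num_layers: int, num_layer_trainable: int) -> list:
    """Pick evenly spaced trainable layer ids for freeze tuning with use_llama_pro
    (paraphrase of the logic in llamafactory/model/adapter.py; names approximate)."""
    if num_layers % num_layer_trainable != 0:
        raise ValueError(
            f"num_layers {num_layers} should be divisible by num_layer_trainable {num_layer_trainable}."
        )
    stride = num_layers // num_layer_trainable
    # e.g. 48 layers with 2 trainable -> stride 24 -> layers 23 and 47 are trained
    return list(range(stride - 1, num_layers, stride))
```

With 49 layers, `49 % 2 == 1`, so `load_model` aborts before any layer is unfrozen.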

Expected behavior

It should work normally, since the goal is to finetune the 2 expanded blocks.

System Info

No response

Others

No response

hzgdeerHo • May 19 '24 13:05