Confused about the llama-pro demo: why should `num_layers` 49 be divisible by `num_layer_trainable` 2?
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
Using the LLaMA Pro example script to finetune the 01-ai/Yi-1.5-9B-Chat model:
Modified expand.sh:
```bash
python scripts/llama_pro.py \
    --model_name_or_path 01-ai/Yi-1.5-9B-Chat \
    --output_dir models/01-ai/Yi-1.5-9B-Chat \
    --num_expand 2
```
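For context on what the expansion should produce, here is my understanding as a minimal sketch (the grouping scheme is my assumption, not the actual `scripts/llama_pro.py` code): the original stack is split into `num_expand` equal groups, and an identity-initialized copy of each group's last layer is inserted after it, so a 48-layer base should come out with 50 layers.

```python
# Hypothetical sketch of LLaMA Pro block expansion (my assumption of the
# scheme, not the actual scripts/llama_pro.py implementation): split the
# stack into `num_expand` equal groups and insert an identity-initialized
# copy of each group's last layer, spreading the new blocks evenly.
def expanded_layer_ids(num_layers: int, num_expand: int) -> list:
    if num_layers % num_expand != 0:
        raise ValueError(f"{num_layers} layers cannot split into {num_expand} equal groups")
    group = num_layers // num_expand
    layout = []
    for i in range(num_layers):
        layout.append(i)
        if (i + 1) % group == 0:          # end of a group: new block goes here
            layout.append(f"copy_of_{i}")  # identity-initialized copy
    return layout

print(len(expanded_layer_ids(48, 2)))  # 50: a 48-layer base gains 2 layers
```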
Modified /examples/extras/llama_pro/llama3_freeze_sft.yaml:
```yaml
### model
model_name_or_path: models/01-ai/Yi-1.5-9B-Chat

### method
stage: sft
do_train: true
finetuning_type: freeze
freeze_trainable_layers: 2
freeze_trainable_modules: all
use_llama_pro: true

### dataset
dataset: identity
template: yi
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/Yi-1.5-9B-Chat/freeze/sft
logging_steps: 1
save_steps: 500
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 0.00005
num_train_epochs: 2
lr_scheduler_type: cosine
warmup_steps: 0.1
fp16: true

### eval
val_size: 0.1
per_device_eval_batch_size: 1
evaluation_strategy: steps
eval_steps: 500
```
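For what it's worth, here is my mental model of why `use_llama_pro` requires the divisibility at all (the stride rule below is an assumption about how the trainable layers are picked, not LLaMA-Factory's actual source): freeze tuning should train only the last layer of each equal-sized group, which is exactly where the expansion inserted the new blocks, and that grouping only works out when `num_layers` divides evenly by `freeze_trainable_layers`.

```python
# Assumed selection logic for freeze tuning with `use_llama_pro` (names and
# stride rule are my guesses, not LLaMA-Factory source): train only the last
# layer of each equal-sized group, i.e. exactly the identity-initialized
# blocks that the expansion step inserted.
num_layers = 50          # what a 48-layer base expanded by 2 should contain
num_layer_trainable = 2  # freeze_trainable_layers in the YAML

stride = num_layers // num_layer_trainable
trainable_layer_ids = [i for i in range(num_layers) if (i + 1) % stride == 0]
print(trainable_layer_ids)  # [24, 49] -- the two newly inserted blocks
```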
And I got the error message:
```
Traceback (most recent call last):
  File "/home/ubuntu/python3.9/bin/llamafactory-cli", line 8, in <module>
    ...
ValueError: `num_layers` 49 should be divisible by `num_layer_trainable` 2.
```
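Reduced to arithmetic, the failure is just a modulus test; a minimal sketch of the guard implied by the message (the exception type and condition are inferred from the truncated traceback, not quoted from the source):

```python
# Minimal reproduction of the guard implied by the error message (the
# exception type and exact condition are inferred, not quoted from source):
num_layers = 49          # reported layer count of the loaded checkpoint
num_layer_trainable = 2  # freeze_trainable_layers

if num_layers % num_layer_trainable != 0:  # 49 % 2 == 1, so this fires
    raise ValueError(
        f"`num_layers` {num_layers} should be divisible by "
        f"`num_layer_trainable` {num_layer_trainable}."
    )
```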
### Expected behavior
It should work normally, since the goal is to finetune only the two expanded blocks.
### System Info
No response
### Others
No response