
How to continue fine-tuning a language model based on a pre-finetuned model?

Open · yang-dongxu opened this issue 2 years ago · 5 comments

I've been fine-tuning a model with LoRA and now I want to continue that process on another related task, using a different dataset. I noticed that there is a pre-finetuned model (i.e., the LLaMA13B-medical you provided :) ) available for my use case. Can you guide me on how to continue fine-tuning from this model? Which command-line arguments and parameters should I use when running the fine-tuning script? Do I need to modify any files or scripts? Your help and suggestions would be greatly appreciated. Thank you!

yang-dongxu · Apr 14 '23 14:04

Thanks for your interest in LMFlow! We currently support continued LoRA training. You may refer to ./scripts/run_finetune_with_lora.sh, with --lora_model_path {path-to-your-lora-model} added; this will further fine-tune the LoRA model. If you would like the output to be a full model instead of base model + LoRA adapter, you may refer to ./scripts/run_finetune_with_lora_save_aggregated_weights.sh, similarly with --lora_model_path {path-to-your-lora-model} added.
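For illustration, a sketch of what the continued-training invocation might look like. The paths and hyperparameters are placeholders, and whether the wrapper script forwards extra flags to examples/finetune.py may depend on your LMFlow version; --lora_model_path is the only addition described above:

# Sketch only: continue LoRA fine-tuning from a previously saved adapter.
# All paths below are placeholders; adjust them to your setup.
./scripts/run_finetune_with_lora.sh \
    --model_name_or_path {path-to-base-model} \
    --dataset_path {path-to-new-task-dataset} \
    --lora_model_path {path-to-your-lora-model} \
    --output_dir output_models/continued_lora

To emit a single merged checkpoint instead, the same flag would go to ./scripts/run_finetune_with_lora_save_aggregated_weights.sh.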

Hope that answers your questions. Thanks 😄

research4pan · Apr 14 '23 16:04

Sorry, I can't find anything about --lora_model_path {path-to-your-lora-model} in that script. Do you mean I should add this parameter to it?

w-JiqQian · Apr 15 '23 08:04

Thank you for your response; however, I encountered an error when attempting to use ./scripts/run_finetune_with_lora_save_aggregated_weights.sh:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/peft/peft_model.py", line 287, in __getattr__
    return super().__getattr__(name)  # defer to nn.Module's logic
  File "/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'PeftModelForCausalLM' object has no attribute 'merge_and_unload'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.9/site-packages/peft/tuners/lora.py", line 211, in __getattr__
    return super().__getattr__(name)  # defer to nn.Module's logic
  File "/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LoraModel' object has no attribute 'merge_and_unload'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/LMFlow/examples/finetune.py", line 70, in <module>
    main()
  File "/LMFlow/examples/finetune.py", line 66, in main
    tuned_model = finetuner.tune(model=model, lm_dataset=lm_dataset)
  File "/LMFlow/src/lmflow/pipeline/finetuner.py", line 238, in tune
    model.merge_lora_weights()
  File "/LMFlow/src/lmflow/models/hf_decoder_model.py", line 415, in merge_lora_weights
    self.get_backend_model().merge_and_unload()
  File "/venv/lib/python3.9/site-packages/peft/peft_model.py", line 289, in __getattr__
    return getattr(self.base_model, name)
  File "/venv/lib/python3.9/site-packages/peft/tuners/lora.py", line 213, in __getattr__
    return getattr(self.model, name)
  File "/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LlamaForCausalLM' object has no attribute 'merge_and_unload'
[2023-04-17 09:14:10,478] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 4100
[2023-04-17 09:14:10,480] [ERROR] [launch.py:324:sigkill_handler] ['/venv/bin/python3.9', '-u', 'examples/finetune.py', '--local_rank=0', '--model_name_or_path', 'basic_models/vicuna-7b-1.1', '--dataset_path', '/LMFlow/data/UniProtQA/train', '--output_dir', '/LMFlow/output_models/finetune_with_lora_agg/UniProtQA', '--overwrite_output_dir', '--num_train_epochs', '3', '--learning_rate', '1e-4', '--block_size', '512', '--per_device_train_batch_size', '1', '--use_lora', '1', '--lora_r', '8', '--save_aggregated_lora', '1', '--deepspeed', 'configs/ds_config_zero2.json', '--bf16', '--run_name', 'finetune_with_lora', '--validation_split_percentage', '0', '--logging_steps', '20', '--do_train', '--ddp_timeout', '72000', '--save_steps', '50000', '--dataloader_num_workers', '2'] exits with return code = 1


yang-dongxu · Apr 17 '23 10:04

Thanks for providing more details! merge_and_unload is a method supported by PeftModel. Could you please share your peft version via pip show peft, so we may check that for you? Thanks 😄
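For reference, a minimal sketch of the merge step that fails in the traceback above, assuming a peft build that actually ships merge_and_unload (all paths are placeholders):

# Minimal sketch of the merge LMFlow performs internally, assuming a
# peft release that ships merge_and_unload. Paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-model")

The three chained AttributeErrors arise because PeftModel.__getattr__ forwards unknown attributes to its base_model (a LoraModel), which in turn forwards to the wrapped LlamaForCausalLM; when the installed peft lacks merge_and_unload, the lookup falls through all three layers.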

research4pan · Apr 19 '23 19:04

Hello, below is the output from the command pip show peft:

Name: peft
Version: 0.3.0.dev0
Summary: Parameter-Efficient Fine-Tuning (PEFT)
Home-page: https://github.com/huggingface/peft
Author: The HuggingFace team
Author-email: [email protected]
License: Apache
Location: /venv/lib/python3.9/site-packages
Requires: torch, transformers, numpy, accelerate, packaging, pyyaml, psutil
Required-by: lmflow

Additionally, I ran the script inside the Docker image that you provided and still hit the merge_and_unload error. Does this mean that the Docker image needs to be updated?
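For readers hitting the same error: one plausible remedy (an assumption, not a fix confirmed in this thread) is to upgrade to a peft release that includes merge_and_unload and then verify it is present:

# Assumption, not a confirmed fix from this thread: upgrade peft to a
# release that ships LoraModel.merge_and_unload, then verify it exists.
pip install -U peft
python -c "from peft.tuners.lora import LoraModel; print(hasattr(LoraModel, 'merge_and_unload'))"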

yang-dongxu · Apr 20 '23 02:04

This issue has been marked as stale because it has not had recent activity. If you think this still needs to be addressed, please feel free to reopen this issue. Thanks!

shizhediao · Jun 19 '23 10:06