alpaca-lora
Error when resuming from checkpoint, unable to load model.
I was trying to fine-tune the model using two distinct prompting methods. For this, I first trained a model on one corpus. Then, using the automatically saved checkpoints, I tried to fine-tune the model on a second corpus with the following command:
python finetune.py --base_model 'decapoda-research/llama-7b-hf' --resume_from_checkpoint './my-model/checkpoint-8400/'
However, when I try to resume from the checkpoint, I get the following error:
Restarting from ./my-model/checkpoint-8400/pytorch_model.bin
Traceback (most recent call last):
File "finetune.py", line 277, in <module>
fire.Fire(train)
File "/data/tmp/astotxo/env/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/data/tmp/astotxo/env/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/data/tmp/astotxo/env/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "finetune.py", line 205, in train
model.print_trainable_parameters() # Be more transparent about the % of trainable params.
AttributeError: 'NoneType' object has no attribute 'print_trainable_parameters'
Am I doing something wrong? Is there a way to resume from a checkpoint and fine-tune the model?
Make sure you are using the latest code. This could be because you have older code that is not compatible with the current peft.
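For context, the traceback is consistent with an older checkpoint-loading block that reassigned model from the return value of set_peft_model_state_dict; newer peft versions no longer return the model from that call (it can return None), so the reassignment leaves model as None and model.print_trainable_parameters() fails. Below is a minimal sketch of the corrected loading step; the helper name load_adapter_checkpoint and the adapter_model.bin fallback are illustrative assumptions, not taken from this thread.

import os
import torch
from peft import PeftModel, set_peft_model_state_dict

def load_adapter_checkpoint(model: PeftModel, resume_from_checkpoint: str) -> None:
    """Load saved LoRA adapter weights into an existing PeftModel, in place."""
    checkpoint_name = os.path.join(resume_from_checkpoint, "pytorch_model.bin")
    if not os.path.exists(checkpoint_name):
        # Hypothetical fallback: some checkpoints store only the LoRA weights.
        checkpoint_name = os.path.join(resume_from_checkpoint, "adapter_model.bin")
    if os.path.exists(checkpoint_name):
        print(f"Restarting from {checkpoint_name}")
        adapters_weights = torch.load(checkpoint_name, map_location="cpu")
        # Call this for its side effect only. Do NOT reassign the result to
        # `model`; with current peft that reassignment is what leaves `model`
        # as None and triggers the AttributeError above.
        set_peft_model_state_dict(model, adapters_weights)
    else:
        print(f"Checkpoint {checkpoint_name} not found")

After this, model is still the original PeftModel with the adapter weights loaded, and a subsequent model.print_trainable_parameters() works as expected.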
I've hit the same issue... it hasn't been solved yet.
Make sure you are using the latest code. This could be because you have older code that is not compatible with the current peft.
I just pulled the latest code and it seems to have solved the problem! Cheers!
I'll keep this issue open since @nkjulia seems to be facing a different problem, but otherwise feel free to close it.