LLaMA-LoRA-Tuner

Aborting training should delete output folder

Open l0rinc opened this issue 2 years ago • 3 comments

[Screenshot: 2023-04-19, 4:55 PM]

l0rinc avatar Apr 19 '23 14:04 l0rinc

Actually, this is intentional: each training run, once started, is forced to have a unique name, which solves a few things:

  • Ensures a unique run name on Wandb (the model name is used as the run name).
  • Avoids issues with model caching, since the model name is used as the cache key (?).

Not deleting the output folder of an aborted run also has benefits:

  • The fine-tuning parameters stored in the output folder are preserved, and can be conveniently loaded back to start the next run.
  • Checkpoints are preserved, making it possible to resume training.
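As a minimal sketch of the resume point above: trainers typically write numbered `checkpoint-N` subfolders into the output directory, so resuming an aborted run amounts to locating the highest-numbered one. The helper name `find_latest_checkpoint` and the `checkpoint-N` naming convention are assumptions for illustration, not code from this project.

```python
import os
import re


def find_latest_checkpoint(output_dir):
    """Return the path of the most recent checkpoint-N subfolder, or None.

    Hypothetical helper: scans a preserved output folder for subfolders
    named ``checkpoint-<step>`` and picks the one with the highest step,
    which is what a resumed training run would load from.
    """
    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for name in os.listdir(output_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    _, latest = max(candidates)
    return os.path.join(output_dir, latest)
```

If the folder had been deleted on abort, there would be nothing for such a lookup to find, which is the trade-off being discussed here.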

zetavg avatar Apr 19 '23 18:04 zetavg

I personally just abort when I see something behaving differently, then rm -rfd the folder manually from Colab's terminal. Maybe a "would you like to overwrite" or "rename the old one" prompt could make sense - if you don't think that's a good idea, please close the issue :)
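The proposed "overwrite or rename" behavior could be sketched like this. Everything here is an assumption for illustration: the function name `resolve_output_dir`, the `overwrite` flag, and the timestamp-suffix rename scheme are all hypothetical, not part of the project's actual API.

```python
import datetime
import os
import shutil


def resolve_output_dir(output_dir, overwrite=False):
    """Decide what to do when the output folder already exists.

    Hypothetical sketch: with overwrite=True the stale folder is removed;
    otherwise it is renamed with a timestamp suffix so nothing from the
    aborted run (parameters, checkpoints) is lost.
    """
    if not os.path.exists(output_dir):
        return output_dir
    if overwrite:
        shutil.rmtree(output_dir)
        return output_dir
    suffix = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    os.rename(output_dir, f"{output_dir}-{suffix}")
    return output_dir
```

Renaming rather than deleting would keep both benefits listed above (reusable parameters, resumable checkpoints) while still freeing the original name for a fresh run.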

l0rinc avatar Apr 19 '23 20:04 l0rinc

This makes sense, I'll add this along with the CLI interface!

zetavg avatar Apr 25 '23 21:04 zetavg