flan-alpaca-lora
Training script takes more than 2 hours to finish
Hi. Thanks for your nice work!
I tried running your training script on an RTX 3090 with the exact dependencies you suggested. It took more than 2 hours to finish instead of 20 minutes. I also tried training flan-t5-large, and that took more than 4 hours. What could be the reasons for this?
It is hard to locate the problem without details of the machine running the code. There are several possible reasons: a different dataset, a different CUDA version, a CPU bottleneck, a data-bus bottleneck, or an older GPU suffering a performance drop. However, if the code runs smoothly, the total time may not be a problem, since you can just let it run.
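One way to narrow down which of these is the culprit is to time a handful of training steps and extrapolate the full run: if the projected time already matches the slow wall-clock time, the per-step cost itself is high (GPU/CUDA side); if not, the overhead is elsewhere (e.g. data loading). A minimal sketch, where `step_fn`, the step counts, and the dummy workload are all placeholders, not part of the actual training script:

```python
import time

def profile_steps(step_fn, n_steps=50):
    """Time n_steps calls of step_fn and return the average seconds per step.
    step_fn stands in for one forward/backward pass of the real trainer."""
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    return (time.perf_counter() - start) / n_steps

# Dummy CPU workload standing in for a real training step.
avg_step = profile_steps(lambda: sum(i * i for i in range(10_000)))

# Hypothetical schedule: 3 epochs of 1600 steps each.
total_steps = 3 * 1600
projected_min = avg_step * total_steps / 60
print(f"avg step: {avg_step * 1000:.2f} ms; projected run: {projected_min:.1f} min")
```

Comparing the projection against the observed 2-hour run tells you whether to look at per-step compute or at everything between steps.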
Thanks for your answer. I just thought it shouldn't be that different :)