Fine-tuning: Can train_mem.py run on CPU?

Open tonyaw opened this issue 2 years ago • 2 comments
I want to fine-tune Vicuna via train_mem.py. It requires the flash_attn module, and flash_attn requires nvcc. Based on that, I assume train_mem.py can only run on a GPU. Is my understanding right?

Is there any way to fine-tune Vicuna on a CPU? I don't care about the training time, as I don't have a GPU. :-)

tonyaw avatar Apr 21 '23 09:04 tonyaw

You can replace flash_attn with the normal attention in PyTorch. Things will still work, although training will be very slow.

zhisbug avatar May 08 '23 09:05 zhisbug

You can replace flash_attn with the normal attention in PyTorch. Things will still work, although training will be very slow.

Can you give detailed steps? Thank you.

chamuyaye avatar Jun 13 '23 01:06 chamuyaye

@zhisbug, can you please suggest the code changes we need to make?

saurabhmahra91 avatar Jun 23 '23 08:06 saurabhmahra91
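
The replacement suggested above can probably be done without editing any model code. A rough sketch, not a confirmed recipe: as far as I can tell, train_mem.py at the time of this thread only applied a flash-attention monkey patch and then called the regular training entry point, so skipping the patch leaves the model on the standard PyTorch attention. The import paths fastchat.train.llama_flash_attn_monkey_patch and fastchat.train.train are assumptions about the FastChat layout and should be checked against the installed version. The helper names module_available, should_use_flash_attn, and main are made up for this example.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def should_use_flash_attn() -> bool:
    """Use flash attention only when flash_attn is installed and a CUDA GPU is visible."""
    if not (module_available("flash_attn") and module_available("torch")):
        return False
    import torch  # imported lazily so this helper also works where torch is missing

    return torch.cuda.is_available()


def main() -> None:
    """Hypothetical CPU-friendly launcher to use in place of train_mem.py."""
    if should_use_flash_attn():
        # Assumed module path and function name; verify against your FastChat checkout.
        from fastchat.train.llama_flash_attn_monkey_patch import (
            replace_llama_attn_with_flash_attn,
        )

        replace_llama_attn_with_flash_attn()

    # Without the patch, the model keeps its default (non-flash) attention.
    # Assumed entry point; verify against your FastChat checkout.
    from fastchat.train.train import train

    train()


# Save this as a script (for example train_cpu.py, a made-up name) and call main().
```

On a machine without a GPU, should_use_flash_attn() returns False, so flash_attn is never imported and nvcc is not needed. GPU-specific command-line options from the fine-tuning example (for instance anything enabling tf32) would presumably have to be dropped as well, and training a 7B model this way will be extremely slow and memory-hungry.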