How to run inference after finetuning

chenzuozhou opened this issue 2 years ago · 4 comments

How do I run inference after finetuning?

chenzuozhou · Apr 24 '23 08:04

Check out this PR: https://github.com/tatsu-lab/stanford_alpaca/pull/199/files
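In short, the flow looks roughly like this (a minimal sketch, not the PR verbatim; the checkpoint directory and prompt are placeholders, and it assumes the fine-tuned weights were saved in Hugging Face format, e.g. via `trainer.save_model()`):

```python
import torch
import transformers

MODEL_DIR = "./alpaca-finetuned"  # placeholder: wherever finetuning saved the weights

tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_DIR)
model = transformers.AutoModelForCausalLM.from_pretrained(
    MODEL_DIR, torch_dtype=torch.float16
)
model.cuda()
model.eval()

# Alpaca prompt template (no-input variant) from this repo's README.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nList three primary colors.\n\n### Response:"
)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```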

tonyzhao6 · Apr 24 '23 17:04

@FruVirus How do I convert the fine-tuned model into a form loadable by the script? What is the command to run?

yxchng · Jun 25 '23 03:06

Thanks for the link!

However, I ran into some problems running the code on my server with three 3090 GPUs (24 GB VRAM each). I fixed the out-of-memory error by commenting out the line `model.cuda()`. Then I fixed the error "Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!" by commenting out the line `num_beams=4,`.
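For reference, the two edits look roughly like this (a sketch, not the PR's exact code; the checkpoint path is a placeholder, and `device_map="auto"` is my guess at how the script ends up sharding the model across the GPUs):

```python
import torch
import transformers

MODEL_DIR = "./alpaca-finetuned"  # placeholder checkpoint path

tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_DIR)
model = transformers.AutoModelForCausalLM.from_pretrained(
    MODEL_DIR,
    torch_dtype=torch.float16,
    device_map="auto",  # shards layers across cuda:0, cuda:1, cuda:2 (needs accelerate)
)
# model.cuda()  # edit 1: commented out; this would gather the whole
#               # model onto cuda:0 and overflow a single 24 GB card

inputs = tokenizer("### Instruction:\nHi\n\n### Response:", return_tensors="pt")
out = model.generate(
    inputs.input_ids.to(model.device),
    max_new_tokens=128,
    # num_beams=4,  # edit 2: commented out; beam search raised the
    #               # "Expected all tensors to be on the same device" error
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```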

I know `model.cuda()` moves the whole model onto the first GPU. But what happens when I comment out the line `num_beams=4`? Why does that fix the error?

BaoBaoGitHub · Jul 21 '23 09:07

It seems like the model is already loaded onto the device(s) during `transformers.AutoModelForCausalLM.from_pretrained`, which would explain why commenting out `model.cuda()` works.
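You can check where `from_pretrained` placed the weights; a small sketch, assuming the model was loaded with `device_map="auto"` (the path is a placeholder):

```python
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
    "./alpaca-finetuned", device_map="auto"
)
# hf_device_map records which GPU each submodule was dispatched to,
# e.g. {'model.embed_tokens': 0, ..., 'lm_head': 2}.
print(model.hf_device_map)
# A later model.cuda() would try to pull all of these onto cuda:0.
```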

As for the `num_beams` error, my error message says it is caused by `inf`, `nan`, or elements < 0, but I have no idea why that happens.

Pegessi · Aug 28 '23 06:08