alpaca-lora
Try to infer from hf model, but producing nothing
Hi friends,
I tried to load a hf llama model and used the generation script (with slight modifications) provided in this repo for inference, but I got nothing. The following is my code:
```python
from transformers import LlamaForCausalLM, LlamaTokenizer
import torch

tokenizer = LlamaTokenizer.from_pretrained(
    "decapoda-research/llama-7b-hf", add_eos_token=True
)
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
model = model.to("cuda:1")


def generate_prompt(instruction, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""


prompt = generate_prompt("1 + 1 eauals to?")
inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to("cuda:1")

with torch.no_grad():
    generation_output = model.generate(
        input_ids=input_ids,
        temperature=0.1,
        top_p=0.75,
        top_k=40,
        num_beams=4,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=512,
    )

s = generation_output.sequences[0]
output = tokenizer.decode(s)
```
After executing all the code above and printing the output, it shows:
```
' Below is an instruction that describes a task. Write a response that appropriately completes the request.\n### Instruction:\n1 + 1 eauals to?\n### Response:'
```

which means nothing was generated.
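One thing worth checking (an assumption on my part, not confirmed from the repo): the tokenizer is loaded with `add_eos_token=True`, which makes it append the end-of-sequence token to the encoded prompt itself, and a prompt that already ends in EOS can cause `generate` to stop immediately. A minimal sketch of the check, using stand-in token ids rather than the real tokenizer:

```python
# Hypothetical diagnostic: does the tokenized prompt already end with EOS?
# With the real tokenizer you would pass inputs["input_ids"][0].tolist()
# and tokenizer.eos_token_id (2 for LLaMA) instead of these stand-ins.
def ends_with_eos(input_ids, eos_token_id):
    """Return True if the last prompt token is the EOS token."""
    return len(input_ids) > 0 and input_ids[-1] == eos_token_id


# Stand-in id sequence for illustration; 1 is BOS, 2 is EOS for LLaMA,
# the middle ids are arbitrary placeholders.
example_ids = [1, 13866, 338, 385, 2]
print(ends_with_eos(example_ids, eos_token_id=2))  # True
```

If this returns `True` for your actual `input_ids`, re-loading the tokenizer without `add_eos_token=True` (or stripping the trailing EOS id before calling `generate`) would be worth trying.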