FastChat
How to set a random seed to get consistent output from Vicuna-13b-v1.3?
Hi, I am using the Vicuna-13b-v1.3 (LLaMA 1) model and have found that the generated output is inconsistent even with the same input prompt. However, I was unable to find any documentation on how to get fixed, consistent output for the same prompt every time. How can I set a fixed random seed at inference time so that results are reproducible for the same input prompt?
Here is the sample code I used. Any advice would be much appreciated. Thank you!
import torch
from fastchat.model import load_model, get_conversation_template

class Vicuna:
    def __init__(self):
        print('Initialize Vicuna...')
        self.model, self.tokenizer = load_model(
            'lmsys/vicuna-13b-v1.3',
            device='cuda',
            num_gpus=1,
        )

    @torch.inference_mode()
    def respond(self, input_msg):
        # Build the Vicuna prompt from the conversation template.
        conv = get_conversation_template('lmsys/vicuna-13b-v1.3')
        conv.append_message(conv.roles[0], input_msg)
        conv.append_message(conv.roles[1], None)
        prompt = conv.get_prompt()

        input_ids = self.tokenizer([prompt]).input_ids
        output_ids = self.model.generate(
            torch.as_tensor(input_ids).cuda(),
            do_sample=True,
            temperature=0.001,
            repetition_penalty=1.0,
            max_new_tokens=512,
        )
        # Strip the prompt tokens and decode only the generated reply.
        output_ids = output_ids[0][len(input_ids[0]):]
        outputs = self.tokenizer.decode(
            output_ids, skip_special_tokens=True, spaces_between_special_tokens=False
        )
        return outputs
I have the same issue
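Not a FastChat-specific answer, but a workaround that should apply here: with do_sample=True, model.generate samples from PyTorch's global RNG, so reseeding all RNGs before each call should make the output reproducible. Below is a minimal sketch using Hugging Face's transformers.set_seed; the helper seeded_respond and the seed value 42 are my own illustration, not part of FastChat.

from transformers import set_seed

# Hypothetical helper: wrap the respond() method above so every call reseeds first.
def seeded_respond(bot, input_msg, seed=42):
    set_seed(seed)  # seeds Python's random, NumPy, and torch (CPU and all CUDA devices)
    return bot.respond(input_msg)

bot = Vicuna()
print(seeded_respond(bot, 'Hello!'))  # same seed + same prompt -> same output
print(seeded_respond(bot, 'Hello!'))  # should match the line above

The key point is that the seed has to be reset before every generate call, not just once at startup, since each sampling step advances the RNG state. Alternatively, since your temperature is already near zero, passing do_sample=False to model.generate switches to greedy decoding, which does not draw from the RNG at all and should give you the same output for the same prompt in most cases.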