FastChat

How to set a random seed to get consistent output from Vicuna-13b-v1.3?

Open ee2110 opened this issue 1 year ago • 1 comment

Hi, I am using the Vicuna-13b-v1.3 (LLaMA 1) model and have found that the generated output is inconsistent even with the same input prompt. I was unable to find any documentation on how to get a fixed, consistent output for the same prompt every time. How can I set a fixed random seed at inference time so that results are reproducible for the same input prompt?

Here is the sample code I used. If anyone could help or advise, it would be much appreciated. Thank you!

import torch
from fastchat.model import load_model, get_conversation_template


class Vicuna():
    def __init__(self):
        print('Initialize Vicuna...')
        # Load the model and tokenizer onto a single GPU.
        self.model, self.tokenizer = load_model(
            'lmsys/vicuna-13b-v1.3',
            device='cuda',
            num_gpus=1
        )

    @torch.inference_mode()
    def respond(self, input_msg):
        # Build the Vicuna chat prompt from the conversation template.
        conv = get_conversation_template('lmsys/vicuna-13b-v1.3')
        conv.append_message(conv.roles[0], input_msg)
        conv.append_message(conv.roles[1], None)
        prompt = conv.get_prompt()

        input_ids = self.tokenizer([prompt]).input_ids
        # Sampling is enabled, so generation is stochastic even at a
        # very low temperature.
        output_ids = self.model.generate(
            torch.as_tensor(input_ids).cuda(),
            do_sample=True,
            temperature=0.001,
            repetition_penalty=1.0,
            max_new_tokens=512,
        )

        # Strip the prompt tokens and decode only the newly generated text.
        output_ids = output_ids[0][len(input_ids[0]):]
        outputs = self.tokenizer.decode(
            output_ids, skip_special_tokens=True, spaces_between_special_tokens=False
        )
        return outputs
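
For context, roughly what I was hoping for is a sketch like the one below: seed all the RNGs right before each generate call. This is just my guess at the mechanism, not something I found in the FastChat docs; set_seed is the Hugging Face transformers helper, and reproducible_respond is a hypothetical wrapper name.

import torch
from transformers import set_seed

def reproducible_respond(bot, input_msg, seed=42):
    # set_seed seeds Python's random, NumPy, and torch (CPU and CUDA)
    # RNGs in one call, so sampling should repeat given the same seed.
    set_seed(seed)
    return bot.respond(input_msg)

# If determinism matters more than sampling, greedy decoding
# (do_sample=False in generate) should also remove the randomness.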

ee2110 · Apr 03 '24 09:04

I have the same issue

lsy641 · Apr 13 '24 22:04