WizardLM
Exactly the same output generations for the same prompt
I was running inference on the HumanEval dataset using `WizardCoder/src/humaneval_gen.py`
with the following parameters (the state of `generation_config`):
```python
{'max_length': 2048,
'max_new_tokens': 1000,
'min_length': 0,
'min_new_tokens': None,
'early_stopping': False,
'max_time': None,
'do_sample': True,
'num_beams': 5,
'num_beam_groups': 1,
'penalty_alpha': None,
'use_cache': True,
'temperature': 10.1,
'top_k': 50,
'top_p': 0.95,
'typical_p': 1.0,
'epsilon_cutoff': 0.0,
'eta_cutoff': 0.0,
'diversity_penalty': 0.0,
'repetition_penalty': 1.0,
'encoder_repetition_penalty': 1.0,
'length_penalty': 1.0,
'no_repeat_ngram_size': 0,
'bad_words_ids': None,
'force_words_ids': None,
'renormalize_logits': False,
'constraints': None,
'forced_bos_token_id': None,
'forced_eos_token_id': None,
'remove_invalid_values': False,
'exponential_decay_length_penalty': None,
'suppress_tokens': None,
'begin_suppress_tokens': None,
'forced_decoder_ids': None,
'sequence_bias': None,
'guidance_scale': None,
'num_return_sequences': 20,
'output_attentions': False,
'output_hidden_states': False,
'output_scores': False,
'return_dict_in_generate': False,
'pad_token_id': 49152,
'bos_token_id': None,
'eos_token_id': 0,
'encoder_no_repeat_ngram_size': 0,
'decoder_start_token_id': None,
'generation_kwargs': {},
'_from_model_config': False,
'_commit_hash': None,
'transformers_version': '4.31.0'}
```
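Equivalently, the decoding-relevant fields of that dump boil down to the following (a sketch using the `transformers` `GenerationConfig` API; every key not listed is at its library default):

```python
from transformers import GenerationConfig

# Sketch of the decoding-relevant fields from the dump above.
generation_config = GenerationConfig(
    do_sample=True,
    num_beams=5,              # >1 combined with do_sample=True selects beam-sample decoding
    temperature=10.1,
    top_k=50,
    top_p=0.95,
    max_new_tokens=1000,
    num_return_sequences=20,
    pad_token_id=49152,
    eos_token_id=0,
)
```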
All 20 generations for the prompt are exactly the same. I have tried setting a very high temperature (around 10) and a high top_p (0.99), and the behavior persists. Am I doing something wrong, or are the model outputs highly deterministic?
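For reference, a minimal standalone version of what I am running (a sketch, assuming the public `WizardLM/WizardCoder-15B-V1.0` checkpoint rather than the repo's exact script):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name for illustration; substitute the model actually used.
model_id = "WizardLM/WizardCoder-15B-V1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# num_beams=1 gives plain multinomial sampling; with num_beams > 1 and
# do_sample=True (as in the config above), generate() runs beam-sample
# decoding instead, which explores far fewer distinct continuations.
outputs = model.generate(
    **inputs,
    do_sample=True,
    num_beams=1,
    temperature=0.8,
    top_p=0.95,
    max_new_tokens=256,
    num_return_sequences=20,
    pad_token_id=tokenizer.eos_token_id,
)
for i, text in enumerate(tokenizer.batch_decode(outputs, skip_special_tokens=True)):
    print(f"--- sample {i} ---\n{text}")
```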
Have you tried more samples?
> Have you tried more samples?
I did try this with different prompts (including non-coding instructions).
I think you have done something wrong, but I cannot figure out what from your config. We checked the generated results on HumanEval with n=20; they are not the same.