alpaca-lora
setting do_sample=True in GenerationConfig generates errors
When I set `do_sample=True` in `GenerationConfig`, I get the error:

```
in lib/python3.8/site-packages/transformers/generation/utils.py", line 3187, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
```
I tried changing the values of `temperature`, `top_p`, `top_k`, and `num_beams`, but that does not solve it.
Not using `do_sample` works, but then the output is fixed (even when changing `temperature`). If I provide only two parameters in `GenerationConfig`, `temperature` (a float value) and `do_sample=True`, it also works. Is there a conflict when all parameters are specified together? (I am using a GPU, if that is of any significance.)
Could someone suggest what might be wrong?
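For reference, a minimal sketch of the two configurations I mean (the checkpoint path, prompt, and parameter values are placeholders, not from my actual setup):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_name = "path/to/base-model"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).cuda()

inputs = tokenizer("Tell me about alpacas.", return_tensors="pt").to(model.device)

# All parameters together -> RuntimeError in beam_sample:
failing_config = GenerationConfig(
    do_sample=True, num_beams=4, temperature=0.1, top_p=0.75, top_k=40
)

# Only temperature + do_sample -> works:
working_config = GenerationConfig(do_sample=True, temperature=0.7)

output = model.generate(**inputs, generation_config=working_config, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```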
I have the same problem too 😭.
This error may be caused by `beam_scores`, which increases linearly with generation length. Refer to "beam_sample throws a nan error on long generations". I think you can set `num_beams=1` or set `do_sample=False` directly.
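For example (a minimal sketch, assuming the same model and `generate` call as in the question):

```python
from transformers import GenerationConfig

# Workaround 1: sample without beam search, so beam_sample (and its
# multinomial over the accumulated beam_scores) is never invoked:
sampling_config = GenerationConfig(
    do_sample=True, num_beams=1, temperature=0.7, top_p=0.75, top_k=40
)

# Workaround 2: keep beam search but disable sampling entirely:
beam_config = GenerationConfig(do_sample=False, num_beams=4)
```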
I've found that it's related to the `temperature` parameter. If you set `temperature` to a high value (e.g., the default value of 1.0), the error goes away.
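That is, something like this (a sketch; only `temperature` differs from the failing configuration above):

```python
from transformers import GenerationConfig

# With temperature=1.0 the logits are not sharpened before softmax, and
# in my experience the inf/nan probability error no longer occurs:
config = GenerationConfig(
    do_sample=True, num_beams=4, temperature=1.0, top_p=0.75, top_k=40
)
```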
Setting these parameters (`num_beams=1` or `do_sample=False`) returns text, but unfortunately it is gibberish, even with a high `temperature` value; see https://github.com/huggingface/transformers/issues/22914#issuecomment-1562034753
Edit: it turned out to be a CUDA 11.8 issue with multi-GPU setups and bitsandbytes. Downgrade bitsandbytes to 0.31.8 and downgrade CUDA to 11.6; see the referenced issue.