
Inference fails with "probability tensor contains either `inf`, `nan` or element < 0" when using beams=2 with temperature=0.1

Open zetavg opened this issue 2 years ago • 2 comments

Using decapoda-research/llama-7b-hf, beams = 2 with temperature = 0.1:

Note: the error is not shown if Stream Output is enabled; in that case, the model just outputs nothing.

beams = 2 with temperature = 0.4 also triggers this error; beams = 2 with temperature = 0.5, however, does not.
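One plausible explanation for the temperature threshold (an assumption, not confirmed in this thread): sampling divides the logits by the temperature, and in fp16 a large-but-finite logit can overflow to `inf` once divided by a small enough temperature. The subtract-the-max stabilization inside softmax then computes `inf - inf = nan`, which is exactly the tensor `torch.multinomial` rejects. A minimal NumPy sketch of that mechanism:

```python
import numpy as np

def stable_softmax(x):
    # subtract-the-max stabilization, as softmax kernels typically do
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()

# a large but still finite fp16 logit (fp16 max is ~65504)
logits = np.array([7000.0, 10.0, 5.0], dtype=np.float16)

# temperature = 0.5: 7000 / 0.5 = 14000 still fits in fp16
ok = stable_softmax(logits / np.float16(0.5))
print(np.isnan(ok).any())           # False

# temperature = 0.1: 7000 / 0.1 = 70000 overflows fp16 to inf
bad_scaled = logits / np.float16(0.1)
print(np.isinf(bad_scaled).any())   # True

# inf - inf = nan inside the stabilized softmax -> nan probabilities
bad = stable_softmax(bad_scaled)
print(np.isnan(bad).any())          # True
```

This would also explain why the error depends on the prompt and the model: it only fires when some logit is large enough that division by the chosen temperature pushes it past the fp16 range.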

(unhelpful-ai-v01-3)

zetavg avatar Apr 17 '23 17:04 zetavg

Same on the trained LoRA model:

Traceback (most recent call last):
  File "/content/llama_lora/llama_lora/lib/streaming_generation_utils.py", line 47, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "/content/llama_lora/llama_lora/lib/inference.py", line 59, in generate_with_callback
    generation_output = model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/peft/peft_model.py", line 631, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/usr/local/lib/python3.9/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 1562, in generate
    return self.beam_sample(
  File "/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py", line 3187, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1020, in postprocess_data
    if predictions[i] is components._Keywords.FINISHED_ITERATING:
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1111, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1024, in postprocess_data
    raise ValueError(
ValueError: Number of output components does not match number of values returned from from function do_inference

l0rinc avatar Apr 21 '23 12:04 l0rinc

@paplorinc I think that's a different error: at some point I forgot to update the return value of the do_inference function, which causes Gradio to fail occasionally. I noticed this a few days ago and fixed it on the main branch. Let me know if it still happens for you!
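For context on that second traceback: Gradio validates that an event handler returns exactly as many values as the output components it was wired to. A dependency-free, hypothetical sketch of that check (the function and parameter names here are illustrative, not Gradio's actual API):

```python
# Hypothetical sketch of Gradio's postprocess check: the number of values
# a handler returns must match the number of output components it was
# registered with, otherwise a ValueError like the one above is raised.
def postprocess(predictions, output_components, fn_name="do_inference"):
    if len(predictions) != len(output_components):
        raise ValueError(
            f"Number of output components does not match number of values "
            f"returned from function {fn_name}"
        )
    return list(zip(output_components, predictions))

# A handler wired to 3 outputs but returning only 2 values triggers it:
try:
    postprocess(("text", 0.5), ["textbox", "label", "json"])
except ValueError as e:
    print(e)
```

So when do_inference's return tuple fell out of sync with the UI's output components, every call that reached postprocessing raised this ValueError, masking the underlying nan-probability error.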

zetavg avatar Apr 25 '23 21:04 zetavg