num_beams=2, RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
- Model inference issue (🤗 transformers)
Loading the model in float16 on a V100 GPU, the provided example runs without problems. But after changing num_beams=1 to num_beams=2, I get RuntimeError: probability tensor contains either `inf`, `nan` or element < 0.
Changing num_beams=1 to num_beams=2 in https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py reproduces the problem.
It may be an issue in transformers, or a problem with the model weights; the cause is not clear yet and is being investigated. For reference, see https://github.com/oobabooga/text-generation-webui/issues/199.
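For reference, a minimal sketch of the failing call, assuming a Hugging Face LLaMA checkpoint loaded in float16 as in inference_hf.py (the model path is a placeholder):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Placeholder path; use the merged Chinese-Alpaca weights or llama-7b-hf.
model_path = "path_to_model"

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # float16, as on the V100 above
    device_map="auto",
)
model.eval()

inputs = tokenizer("hi", return_tensors="pt").to(model.device)
with torch.no_grad():
    generation_output = model.generate(
        **inputs,
        do_sample=True,
        num_beams=2,  # num_beams=1 does not raise
        temperature=0.2,
        top_k=40,
        top_p=0.9,
        repetition_penalty=1.3,
        max_new_tokens=400,
    )
print(tokenizer.decode(generation_output[0], skip_special_tokens=True))
```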
Same problem. My generation config: do_sample=True, num_beams>1 (which uses beam_sample). It turns out to be normal when I set num_beams=1 (which uses sample). I didn't try setting do_sample to False, but that seems to work in other issues; see the dispatch sketch below.
Note: this error appears both with the merged weights and after personal fine-tuning.
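To make the mode dispatch above concrete, a minimal sketch assuming transformers 4.x; the combinations listed are the ones tested in this thread:

```python
from transformers import GenerationConfig

# do_sample / num_beams combinations and the generate() path they select:
#   do_sample=False, num_beams=1 -> greedy search
#   do_sample=True,  num_beams=1 -> sample()       (reported to work)
#   do_sample=False, num_beams=2 -> beam search    (reported to work)
#   do_sample=True,  num_beams=2 -> beam_sample()  (raises the RuntimeError)
failing_config = GenerationConfig(do_sample=True, num_beams=2)
working_config = GenerationConfig(do_sample=True, num_beams=1)
```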
@ZenBuilds @jzsbioinfo
Did you encounter the same problem when inferring with the original LLaMA with beam size set to 2? With a command like the following:
python inference_hf.py --base_model path_to_llama_7b_hf --interactive
> @ZenBuilds @jzsbioinfo
> Did you encounter the same problem when inferring with the original LLaMA with beam size set to 2? With a command like the following:
> python inference_hf.py --base_model path_to_llama_7b_hf --interactive
I have tried your command using llama-7b-hf with num_beams=2, and it happens again.
Generation parameters:

```python
generation_config = dict(
    temperature=0.2,
    top_k=40,
    top_p=0.9,
    do_sample=True,
    num_beams=2,  # changed from 1
    repetition_penalty=1.3,
    max_new_tokens=400
)
```
The error looks like:

```
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /home/tiger/.local/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda120.so
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 120
CUDA SETUP: Loading binary /home/tiger/.local/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda120.so...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 33/33 [00:47<00:00, 1.43s/it]
Vocab of the base model: 32000
Vocab of the tokenizer: 32000
Input:hi
/home/tiger/.local/lib/python3.9/site-packages/transformers/generation/utils.py:1219: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
Traceback (most recent call last):
  File "/opt/tiger/startbash/Chinese-LLaMA-Alpaca/scripts/inference_hf.py", line 104, in <module>
    generation_output = model.generate(
  File "/home/tiger/.local/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1562, in generate
    return self.beam_sample(
  File "/home/tiger/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 3187, in beam_sample
    next_tokens = torch.multinomial(probs, num_samples=2 * num_beams)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
```
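The exception itself is torch.multinomial's input validation rejecting a probability tensor with inf, nan, or negative entries (e.g. after a float16 overflow in the beam-sample probability arithmetic, though the root cause here was not confirmed). A minimal standalone demonstration:

```python
import torch

# A probability vector containing nan triggers the exact same error message.
probs = torch.tensor([0.5, float("nan"), 0.5])
torch.multinomial(probs, num_samples=2)
# RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
```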
I re-tested on an A100:

- do_sample=False, num_beams=2: no problem
- do_sample=True, num_beams=1: no problem
- do_sample=True, num_beams=2: problem

According to https://huggingface.co/docs/transformers/main_classes/text_generation, the failing combination is exactly the one that performs beam-search multinomial sampling, so that is what triggers the error. I don't know how to solve it yet.
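Until the root cause is found, two hedged workaround sketches, reusing `model` and `inputs` from the repro sketch above (both untested against this exact setup): avoid the beam-sample path entirely, or keep it but ask generate() to sanitize invalid logits via transformers' documented remove_invalid_values flag, which adds an InfNanRemoveLogitsProcessor.

```python
# Option 1: plain beam search, which the A100 tests above report as working.
generation_output = model.generate(
    **inputs,
    do_sample=False,  # beam search instead of beam sample
    num_beams=2,
    repetition_penalty=1.3,
    max_new_tokens=400,
)

# Option 2: keep beam sampling but drop nan/inf logits before sampling
# (remove_invalid_values may slow generation down).
generation_output = model.generate(
    **inputs,
    do_sample=True,
    num_beams=2,
    remove_invalid_values=True,
    max_new_tokens=400,
)
```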
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.