RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Open LovingThresh opened this issue 1 year ago • 9 comments

RuntimeError: probability tensor contains either inf, nan or element < 0

Whatever I input, it will raise this RuntimeError.

H: what can you do?
Assistant:

│ 101 │     token = int(torch.argmax(last_token_logits))
│ 102 │ else:
│ 103 │     probs = torch.softmax(last_token_logits / temperature, dim=-1)
│ 104 │     token = int(torch.multinomial(probs, num_samples=1))
│ 105 │
│ 106 │ output_ids.append(token)
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: probability tensor contains either inf, nan or element < 0
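
For readers hitting the same traceback: `torch.multinomial` raises this error when `probs` is not a valid distribution, and one common way that happens is that the logits already contain `nan` or `inf`. The following is a minimal sketch in plain Python (standard library only, not the FastChat code) of how a single non-finite logit turns the whole softmax output into `nan`. The helper names `softmax` and `is_valid_distribution` are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_distribution(probs):
    """True when every entry is finite and non-negative."""
    return all(math.isfinite(p) and p >= 0 for p in probs)


# Finite logits give a usable distribution.
healthy = softmax([2.0, 1.0, 0.1], temperature=0.7)

# One NaN logit makes the normalising sum NaN, so every probability is NaN.
broken = softmax([2.0, float("nan"), 0.1], temperature=0.7)

print(is_valid_distribution(healthy))
print(is_valid_distribution(broken))
```

If a check like this fails on the real logits, the sampling code is probably not at fault and the model weights or the numeric precision are the more likely suspects.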

LovingThresh avatar Apr 07 '23 12:04 LovingThresh

what hardware are you running on?

zhisbug avatar Apr 07 '23 21:04 zhisbug

I tried it in an NGC Docker container on Ubuntu 22.04. I also tried Windows, but both fail.

LovingThresh avatar Apr 08 '23 04:04 LovingThresh

https://github.com/lm-sys/FastChat/issues/153

djaffer avatar Apr 08 '23 21:04 djaffer

#153

@djaffer I followed the added_tokens.json approach and sadly it didn't work in my case. Did it work in yours?

oasis-0927 avatar Apr 10 '23 09:04 oasis-0927

No, it didn't.

djaffer avatar Apr 10 '23 09:04 djaffer

Found a solution that might work. I downloaded the vicuna weights from https://huggingface.co/AlekseyKorshuk/vicuna-7b/ instead of applying the delta weights to the llama-7b model, and it worked. My guess is that this error results either from the apply_delta process or from the llama-7b model not being compatible with the vicuna-delta weights.

Hope it helps.
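
One way to test that guess is to scan the merged weights for `nan` or `inf` before running inference. Below is a minimal sketch in plain Python, assuming the weights are available as a mapping from parameter name to a flat list of floats. The function name `find_non_finite` and the parameter names in the toy data are hypothetical; with PyTorch one would iterate over the model's `state_dict()` and use `torch.isfinite` instead.

```python
import math


def find_non_finite(weights):
    """Return {name: count} for parameters that hold NaN or inf values.

    `weights` is a hypothetical mapping: parameter name -> flat list of floats.
    """
    report = {}
    for name, values in weights.items():
        bad = sum(1 for v in values if not math.isfinite(v))
        if bad:
            report[name] = bad
    return report


# Toy stand-in for merged weights; the names are made up for illustration.
merged = {
    "embed_tokens.weight": [0.01, -0.02, float("nan")],
    "lm_head.weight": [0.5, 0.25, -0.125],
}

print(find_non_finite(merged))
```

An empty report would suggest the merged weights themselves are finite, which points away from the delta-merging step and toward the loading or precision settings.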

oasis-0927 avatar Apr 11 '23 09:04 oasis-0927

Thank you! But we use the 13B model. Where can we find those weights?

LovingThresh avatar Apr 11 '23 10:04 LovingThresh

@LovingThresh Maybe this solution is worth a try. https://zhuanlan.zhihu.com/p/620801429

oasis-0927 avatar Apr 12 '23 02:04 oasis-0927

Thank you @JuntingGuo, it works on CPU and on a single GPU with int8, but not on multiple GPUs. It looks like there is still some work to do?

LovingThresh avatar Apr 12 '23 11:04 LovingThresh

Is this issue resolved for all of you? Have you tried the new weights and apply_delta script since our v1.1 release?

zhisbug avatar Apr 21 '23 01:04 zhisbug

This problem has been solved.

LovingThresh avatar Apr 21 '23 03:04 LovingThresh

Great to hear @LovingThresh !

zhisbug avatar Apr 21 '23 23:04 zhisbug

I met the same problem and solved it by loading the model on a single GPU in 8-bit. Loading in 4-bit causes the problem.

ZhaoChuyang avatar Dec 28 '23 12:12 ZhaoChuyang