OpenChatKit
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
My environment: GPU: V100-32G, torch: 1.13.1+cu116, python: 3.7.13
I load the model in int8:
`tokenizer = AutoTokenizer.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B")`
`model = AutoModelForCausalLM.from_pretrained("togethercomputer/GPT-NeoXT-Chat-Base-20B", device_map="auto", load_in_8bit=True)`
When I run `model.generate`, the error occurs at `next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)`: `RuntimeError: probability tensor contains either inf, nan or element < 0`.
I printed the values and found that after `next_token_scores = logits_processor(input_ids, next_token_logits)`, all the elements of the tensor are `nan`.
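A quick way to localize where the `nan`s first appear is to check each tensor in the sampling step for non-finite values. The sketch below is a diagnostic, not OpenChatKit code; the tensor names mirror the generate loop above, and the logits here are fake, with one `nan` injected to show how softmax spreads it across the whole row:

```python
import torch

def check_finite(name, t):
    # Count nan/inf elements so the first bad step in the
    # sampling pipeline can be identified.
    bad = (~torch.isfinite(t)).sum().item()
    print(f"{name}: shape={tuple(t.shape)}, non-finite elements={bad}")
    return bad == 0

# Fake vocab-sized logits with a single injected nan (illustrative only):
next_token_logits = torch.randn(1, 50432)
next_token_logits[0, 10] = float("nan")

check_finite("next_token_logits", next_token_logits)

# softmax normalizes over the row, so a single nan logit makes the
# normalizing sum nan and therefore every probability nan -- which is
# why probs ends up all-nan even if only a few logits are bad.
probs = torch.softmax(next_token_logits, dim=-1)
check_finite("probs", probs)  # reports 50432 non-finite elements
```

This also explains the all-`nan` `probs` seen below: one bad logit is enough to poison the entire softmax output.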
I have the same problem. The inputs are all valid (not `nan`, not `inf`, and > 0), but `probs` seems to be all `nan`:
(Pdb) probs[0,-100:]
tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan], device='cuda:0', dtype=torch.float16)
This problem seems to be caused by an incorrect configuration in the "Converting weights to Hugging Face format" step:
`--n-stages` can be found from the checkpoint folder; in my environment, `--n-stages 2`.
`--n-layer-per-stage` must match the model configuration: `--n-layer-per-stage` = model layers / `--n-stages`. In my environment, `--n-layer-per-stage 16`.
https://huggingface.co/EleutherAI/pythia-6.9b-deduped
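The relationship can be sanity-checked before running the conversion. A minimal sketch, assuming the values from the comment above (Pythia-6.9B-deduped has 32 hidden layers per its config; the stage count is whatever your checkpoint folder was sharded into):

```python
# Sanity check for the conversion flags: n_layer_per_stage * n_stages
# must equal the model's total layer count, or the converted weights
# will be misaligned and generate produces nan logits.
total_layers = 32   # num_hidden_layers from the model's config.json
n_stages = 2        # number of pipeline stages in the checkpoint folder

assert total_layers % n_stages == 0, "layers must divide evenly across stages"
n_layer_per_stage = total_layers // n_stages
print(f"--n-stages {n_stages} --n-layer-per-stage {n_layer_per_stage}")
# prints: --n-stages 2 --n-layer-per-stage 16
```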