aici
very low temperature causes crash in sampling
The logits tensor is float16, and we use a logit of -100 to ban a token. Setting the temperature below roughly 0.0003 makes the scaled ban logit overflow float16 (-100 / 0.0003 ≈ -333,333, beyond float16's ±65,504 range), which leads to the following crash:
File "/workspaces/aici/vllm/vllm/model_executor/layers/sampler.py", line 409, in _sample
parent_seq_ids, next_token_ids = _sample_from_generation_tokens(
File "/workspaces/aici/vllm/vllm/model_executor/layers/sampler.py", line 356, in _sample_from_generation_tokens
next_token_ids = torch.multinomial(probs,
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
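The failure mode can be reproduced outside vLLM. The sketch below (a hypothetical illustration using NumPy, not the actual vLLM code) shows how dividing the -100 ban logit by a sub-0.0003 temperature overflows float16 to -inf, and how a row of all -inf logits then yields NaN probabilities, which is exactly what `torch.multinomial` rejects:

```python
import numpy as np

# A float16 logit of -100 is used to ban a token (as in the report).
ban_logit = np.float16(-100.0)

# float16 tops out around +/-65504; -100 / 0.0003 is about -333,333,
# so the temperature-scaled logit overflows to -inf.
with np.errstate(over="ignore", invalid="ignore"):
    scaled = ban_logit / np.float16(0.0003)
print(scaled)  # -inf

# If every candidate logit in a row overflows to -inf, the usual
# stabilized softmax computes -inf - (-inf) = nan, so the probability
# row is all NaN -- the condition torch.multinomial raises on.
with np.errstate(over="ignore", invalid="ignore"):
    logits = np.full(4, np.float16(-100.0)) / np.float16(0.0002)
    shifted = logits - logits.max()   # -inf - (-inf) = nan
    probs = np.exp(shifted)
print(np.isnan(probs).any())  # True
```

A plausible fix is to clamp the temperature to a safe minimum (or switch to greedy sampling below some threshold) before dividing the logits.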