Does FastChat support the gemma2-27b model?
When I tried to use fastchat.serve.cli, I got this error:

```
root@4034937c8c66:/mnt/fastchat/FastChat-main# CUDA_VISIBLE_DEVICES=3 python3 -m fastchat.serve.cli --model /mnt/gemma2
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 12/12 [01:08<00:00, 5.70s/it]
user: hello
model: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/cli.py", line 304, in ...
    ...
probability tensor contains either inf, nan or element < 0
```
I think the model itself is fine, because I get normal results from the demo on Hugging Face.
In addition, a similar error occurs when deploying with `fastchat.serve.openai_api_server`:

```
{
  "object": "error",
  "message": "NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(probability tensor contains either inf, nan or element < 0)",
  "code": 50001
}
```
I have the same issue with both the 9B and 27B models.
Have the same issue with 27B ...
How can this be solved?
Please follow this issue, guys -> https://github.com/meta-llama/llama/issues/380
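The linked llama issue traces this class of error to float16 overflow: logits that exceed the fp16 range become `inf`, the subsequent softmax produces NaN probabilities, and the sampler rejects them with exactly the "probability tensor contains either inf, nan or element < 0" message. A minimal sketch of that failure mode (in NumPy for illustration; the actual FastChat code path is not shown in this thread):

```python
import numpy as np

def softmax(x):
    # Standard max-subtraction trick; when one logit is inf this
    # computes inf - inf = nan, which propagates into every probability.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

# 70000 exceeds the float16 maximum (~65504), so the cast overflows to inf.
logits = np.array([70000.0, 1.0, 2.0])

p16 = softmax(logits.astype(np.float16))
p32 = softmax(logits.astype(np.float32))

print(np.isnan(p16).any())  # True: NaN probabilities -> sampler error
print(np.isnan(p32).any())  # False: the wider exponent range is safe
```

If this is the cause here, loading the model in bfloat16 or float32 instead of float16 (for example via FastChat's `--dtype bfloat16` flag, if your version supports it) typically avoids the error, since bfloat16 shares float32's exponent range.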