
Does FastChat support the gemma2-27b-it model?

Open zhouyuustc opened this issue 1 year ago • 4 comments

When I tried to use `fastchat.serve.cli`, I got the following error:

```
root@4034937c8c66:/mnt/fastchat/FastChat-main# CUDA_VISIBLE_DEVICES=3 python3 -m fastchat.serve.cli --model /mnt/gemma2
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 12/12 [01:08<00:00, 5.70s/it]
user: hello
model: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/cli.py", line 304, in <module>
    main(args)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/cli.py", line 227, in main
    chat_loop(
  File "/mnt/fastchat/FastChat-main/fastchat/serve/inference.py", line 532, in chat_loop
    outputs = chatio.stream_output(output_stream)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/cli.py", line 63, in stream_output
    for outputs in output_stream:
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/inference.py", line 190, in generate_stream
    indices = torch.multinomial(probs, num_samples=2)
RuntimeError: probability tensor contains either inf, nan or element < 0
```

I think the model itself should be fine, because I get normal results from the demo on Hugging Face. In addition, a similar error occurs when deploying with the `openai_api_server` method:

`{ "object": "error", "message": "NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(probability tensor contains either inf, nan or element < 0)", "code": 50001 }`
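For anyone reproducing this: the `RuntimeError` is raised when the probabilities handed to `torch.multinomial` contain a non-finite or negative value. The sketch below is a dependency-free illustration of how a single infinite logit can turn an entire softmax output into NaN. It does not identify the actual cause in this issue. The names `softmax` and `find_invalid` are hypothetical helpers written for this example, not FastChat or PyTorch functions.

```python
import math


def softmax(logits):
    """Standard softmax: subtract the max, exponentiate, normalize."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def find_invalid(probs):
    """Return the indices of entries that are nan, inf, or negative."""
    return [
        i
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


# Finite logits give a valid distribution.
healthy = softmax([1.0, 2.0, 3.0])

# One infinite logit (for example after a numeric overflow upstream)
# makes inf - inf = nan, and the nan spreads through the normalization.
overflowed = softmax([1.0, float("inf"), 3.0])

print(find_invalid(healthy))
print(find_invalid(overflowed))
```

Running an equivalent check on the logits just before the sampling step is one way to tell whether the invalid values come from the model's forward pass or from the sampling parameters.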

zhouyuustc • Jul 26 '24 08:07