Does FastChat support the gemma2-27b model?
When I tried to use fastchat.serve.cli, I got this error:

```
root@4034937c8c66:/mnt/fastchat/FastChat-main# CUDA_VISIBLE_DEVICES=3 python3 -m fastchat.serve.cli --model /mnt/gemma2
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 12/12 [01:08<00:00, 5.70s/it]
user: hello
model: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mnt/fastchat/FastChat-main/fastchat/serve/cli.py", line 304, in ...
    ...
probability tensor contains either inf, nan or element < 0
```
I think the model itself is fine, because I get normal results from the demo on Hugging Face.
In addition, a similar error occurs when deploying with `fastchat.serve.openai_api_server`:

```
{
  "object": "error",
  "message": "NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(probability tensor contains either inf, nan or element < 0)",
  "code": 50001
}
```
I have the same issue with both the 9B and 27B models.
Have the same issue with 27B ...
How can this be solved?
Please follow this issue, guys -> https://github.com/meta-llama/llama/issues/380
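The linked llama issue traces this class of error to float16 overflow: logits that exceed the fp16 range become `inf`, the subsequent softmax produces NaN probabilities, and the sampler rejects them with exactly the "probability tensor contains either inf, nan or element < 0" message. A minimal sketch of that failure mode (in NumPy for illustration; the actual FastChat code path is not shown in this thread):

```python
import numpy as np

def softmax(x):
    # Standard max-subtraction trick; when one logit is inf this
    # computes inf - inf = nan, which propagates into every probability.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

# 70000 exceeds the float16 maximum (~65504), so the cast overflows to inf.
logits = np.array([70000.0, 1.0, 2.0])

p16 = softmax(logits.astype(np.float16))
p32 = softmax(logits.astype(np.float32))

print(np.isnan(p16).any())  # True: NaN probabilities -> sampler error
print(np.isnan(p32).any())  # False: the wider exponent range is safe
```

If this is the cause here, loading the model in bfloat16 or float32 instead of float16 (for example via FastChat's `--dtype bfloat16` flag, if your version supports it) typically avoids the error, since bfloat16 shares float32's exponent range.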