server crashed for some reason, unable to proceed
I ran the following command but keep getting an error.
```shell
python -m mii.entrypoints.openai_api_server \
    --model "/logs/llama-2-70b-chat/" \
    --port 8000 \
    --host 0.0.0.0 \
    --tensor-parallel 2
```
```
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/mii/entrypoints/openai_api_server.py", line 506, in
```
I don't know how to solve this issue.
Hi @Archmilio, could you try running the model in a pipeline? I suspect that the server is crashing while loading the model, but since it runs as a separate process the real error is not being shown:
```python
import mii

# Load the model directly; if the server is failing during model load, the real error will surface here.
pipe = mii.pipeline("/logs/llama-2-70b-chat/", tensor_parallel=2)
print(pipe("DeepSpeed is"))
```
Save this as `example.py` and run it with `deepspeed --num_gpus 2 example.py`.
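In case it helps, here is a slightly fuller sketch of `example.py`; the prompt list and the `max_new_tokens` value are just illustrative choices, not required arguments:

```python
# example.py -- minimal sketch for checking that the model loads in a MII pipeline.
# Assumes DeepSpeed-MII is installed and the checkpoint path below is valid.
import mii

# Use the same tensor-parallel degree you passed to the server.
pipe = mii.pipeline("/logs/llama-2-70b-chat/", tensor_parallel=2)

# Generate a short continuation; max_new_tokens=64 is only an example value.
responses = pipe(["DeepSpeed is"], max_new_tokens=64)
for response in responses:
    print(response)
```

If this run crashes, the traceback it prints should show the underlying model-loading error; if it loads and generates fine, we can narrow the problem down to the server entrypoint itself.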