Archmilio
I executed the following script but keep getting an error. python -m mii.entrypoints.openai_api_server \ --model "/logs/llama-2-70b-chat/" \ --port 8000 \ --host 0.0.0.0 \ --tensor-parallel 2 Traceback (most recent call last):...
Thank you for your hard work. I am really excited about MII's performance. I have some questions: Is token streaming supported now? If token streaming is supported, I would...
### System Info I am testing with the TGI Tool Call, but the error keeps occurring. Could you check it? ### Information - [X] Docker - [ ] The...
### System Info - GPU Properties NVIDIA H100 - Libraries v0.20.0rc1 - Docker ### Who can help? _No response_ ### Information - [x] The official example scripts - [x] My...