DeepSpeed-MII
Support for token streaming
Thank you for your hard work. I am really excited about MII performance.
I have a couple of questions:
Is token streaming supported now?
If it is, I would like to measure the first-token latency and the total completion time. If it is not, do you happen to know when it will be supported?
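
For reference, the two metrics above could be measured roughly as follows. This is only a minimal sketch, assuming the streaming output can be consumed as a plain Python iterator of tokens. `measure_stream` and `fake_stream` are hypothetical names used for illustration; they are not part of the MII API, and `fake_stream` merely simulates a model with a fixed per-token delay.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream(tokens: Iterable[str]) -> Tuple[float, float, int]:
    """Consume a token stream and return
    (first_token_latency_s, completion_time_s, token_count)."""
    start = time.perf_counter()
    first_token_latency = None
    count = 0
    for _ in tokens:
        # Record the elapsed time when the very first token arrives.
        if first_token_latency is None:
            first_token_latency = time.perf_counter() - start
        count += 1
    completion_time = time.perf_counter() - start
    if first_token_latency is None:
        # Empty stream: no token ever arrived, so fall back to the total time.
        first_token_latency = completion_time
    return first_token_latency, completion_time, count


def fake_stream(n: int = 5, delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming backend (NOT an MII function)."""
    for i in range(n):
        time.sleep(delay_s)  # simulate per-token generation time
        yield f"tok{i}"


if __name__ == "__main__":
    ttft, total, n = measure_stream(fake_stream())
    print(f"tokens={n} first_token={ttft:.4f}s completion={total:.4f}s")
```

The timer starts before iteration begins, so this only captures the full first-token latency if the stream is lazy (the request is issued when the first token is pulled). With an eager backend, the timer would need to start before the request is sent.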