Streaming output support?
Is streaming output supported? And are there any results on time to first token (TTFT) or time per output token (TPOT)? Thanks.
By copying the `def stream_chat(self, ...)` method from modeling_internlm2.py (https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5/blob/main/modeling_internlm2.py) into modeling_internvl_chat.py and making a few small changes, I implemented it and verified that it works.
@NiYueLiuFeng Can you share your modeling_internvl_chat.py? Thank you very much
Hi, see this guide for streaming output: https://internvl.readthedocs.io/en/latest/internvl2.0/quick_start.html#streaming-output
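The linked guide streams by passing a `TextIteratorStreamer` from `transformers` into the generation call and running generation in a background thread, then iterating tokens in the main thread. Here is a minimal, self-contained sketch of that producer/consumer pattern; `TokenStreamer` and `fake_generate` are stand-ins I made up for the streamer and the model call, so only the threading/iteration structure reflects the real usage:

```python
import threading
import queue

SENTINEL = object()  # marks end-of-generation

class TokenStreamer:
    """Minimal stand-in for transformers' TextIteratorStreamer:
    the generation thread calls put()/end(); the caller iterates tokens."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, token):
        self._queue.put(token)

    def end(self):
        self._queue.put(SENTINEL)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until the next token arrives
        if item is SENTINEL:
            raise StopIteration
        return item

def fake_generate(streamer, tokens):
    # Stand-in for the model's generate/chat call with streamer=streamer;
    # a real model would push decoded tokens here as they are produced.
    for token in tokens:
        streamer.put(token)
    streamer.end()

streamer = TokenStreamer()
thread = threading.Thread(
    target=fake_generate,
    args=(streamer, ["Hello", ", ", "world", "!"]),
)
thread.start()

generated = ""
for token in streamer:  # tokens arrive incrementally, not all at once
    generated += token
thread.join()
print(generated)
```

With the real model you would replace `fake_generate` with the actual generation call (run in the thread) and `TokenStreamer` with `TextIteratorStreamer(tokenizer, ...)`; the consuming loop stays the same.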