intel-extension-for-transformers
feature request: support HF's TextIteratorStreamer
Hi, I want to be able to stream the model's output somewhere other than stdout. The current streamer, TextStreamer, only works with stdout as I understand it. I tried using TextIteratorStreamer, but the current code does not support it.
here is a reference:
https://huggingface.co/docs/transformers/v4.36.1/en/internal/generation_utils#transformers.TextIteratorStreamer
https://github.com/huggingface/transformers/blob/fc5b7419d4c8121d8f1fa915504bcc353422559e/src/transformers/generation/streamers.py#L125
I think supporting it is important for use in web applications. I am trying to demonstrate the performance of these Intel models with Streamlit, but I can't stream.
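For context, the pattern TextIteratorStreamer enables is: run `model.generate(..., streamer=streamer)` in a background thread while the main thread (e.g. a Streamlit render loop) iterates over the streamer to receive text chunks as they are produced. The sketch below mimics that interface with a minimal queue-backed class so the threading pattern is clear without depending on transformers; `MiniIteratorStreamer` and `fake_generate` are illustrative stand-ins, not part of any library.

```python
# Minimal sketch of the TextIteratorStreamer pattern: the generation
# thread pushes decoded chunks into a queue via put()/end(), and the
# consumer thread iterates over the streamer to drain them.
import queue
import threading


class MiniIteratorStreamer:
    """Queue-backed streamer modeled on transformers.TextIteratorStreamer."""

    _sentinel = object()  # marks end of generation

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, text):
        # Called by the generation thread for each decoded chunk.
        self._queue.put(text)

    def end(self):
        # Called once when generation is finished.
        self._queue.put(self._sentinel)

    def __iter__(self):
        while True:
            item = self._queue.get()
            if item is self._sentinel:
                return
            yield item


def fake_generate(streamer, tokens):
    # Stand-in for model.generate(..., streamer=streamer).
    for tok in tokens:
        streamer.put(tok)
    streamer.end()


streamer = MiniIteratorStreamer()
thread = threading.Thread(
    target=fake_generate, args=(streamer, ["Hello", " ", "world"])
)
thread.start()

# Main thread: this loop is where a web app (e.g. Streamlit's
# st.write_stream) would render each chunk as it arrives.
chunks = [chunk for chunk in streamer]
thread.join()
print("".join(chunks))
```

With the real library, the only differences are that `TextIteratorStreamer` is constructed with a tokenizer and the background thread runs `model.generate`; the consuming loop is identical.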
Thanks for your feedback; we will support it.
@RachelShalom Hi Rachel. I am also trying to demonstrate the inference speed of the LLM on Intel. Were you able to find any workaround or other method to stream the tokens?
@AdityaKulshrestha I assume there are serving options. @kevinintel did you decide to work on it?
Yes, we will support it soon. Once the feature is enabled, I will post an update in this issue.