transformers.js
[Feature request] streamer callback for text-generation task
Streamer https://huggingface.co/docs/transformers/generation_strategies#streaming
Reason for request
Currently, iterating with max_new_tokens: 1
takes much longer than a single generation call. Text generation takes time even for light models, so token streaming is a key feature for the user experience. In my case, task-specific text generation could be a key feature of low-cost AI app development using transformers.js.
Additional context
I'm not sure whether the TextStreamer
class needs to be compatible with the Python transformers library. I wrote a use-case proposal in which TextStreamer extends TransformStream
. AsyncIterable, AsyncGenerator, and the Streams API might also be usable.
Suggesting streaming code
let aggregatedResponse = '';
const streamer = new TextStreamer();
const pipe: TextGenerationPipeline = await pipeline(
  'text-generation',
  model,
  { quantized: true }
);
pipe(prompt, { streamer });
const res = new Response(streamer.readable);
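To make the proposal concrete, here is a minimal sketch of what such a TextStreamer might look like. This is an assumption about a possible design, not the library's API: it extends the standard TransformStream, encoding each decoded text fragment into bytes so that streamer.readable can be passed directly to new Response(...).

```javascript
// Sketch (hypothetical, not part of transformers.js): a TextStreamer built
// on the standard TransformStream. The generation loop would write decoded
// text fragments to streamer.writable; consumers read bytes from
// streamer.readable (e.g. via `new Response(streamer.readable)`).
class TextStreamer extends TransformStream {
  constructor() {
    const encoder = new TextEncoder();
    super({
      transform(chunk, controller) {
        // Each chunk is assumed to be a decoded text fragment (a string).
        controller.enqueue(encoder.encode(chunk));
      },
    });
  }
}
```

Because it is a plain TransformStream, it composes with the rest of the Streams API (piping, backpressure) for free.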
This is vercel's approach https://github.com/vercel/ai/blob/main/packages/core/streams/ai-stream.ts https://github.com/vercel-labs/ai-chatbot/blob/main/app/api/chat/route.ts
Hi there 👋 I definitely think the addition of an equivalent TextStreamer
class to the library would be great! If someone in the community would like to contribute this, it should be as simple as rewriting this file in JavaScript.
The current approach to text streaming (which was actually added before the Python library added TextStreamer
) is to pass a callback_function
to the generate/pipeline function. For example:
const pipe = await pipeline(
'text-generation',
model,
{ quantized: true }
)
pipe(prompt, { callback_function: beams => { console.log(beams) }})
Here's an example of streaming + decoding:
https://github.com/xenova/transformers.js/blob/4e4148cb5ce7f4a9265f58b4eeb660c64bed0386/examples/demo-site/src/worker.js#L189-L202
@xenova How should the callback_function be defined to make text generation stop at special words (like the OpenAI API's "stop" param)? I also found the code from transformers.js you showed, but I am confused about what to do next.
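For what it's worth, one way to emulate a stop parameter on top of callback_function is to decode the beam's tokens (as in the linked worker.js example) and truncate the text once a stop string appears. The helper below is a hypothetical sketch, not part of transformers.js; note that callback_function alone cannot abort generation early, so this only controls what text you keep and display.

```javascript
// Hypothetical helper (not a library API): given decoded text and a list of
// stop strings, return the text truncated at the earliest stop string, or
// null if no stop string has appeared yet.
function truncateAtStop(text, stopWords) {
  let cut = -1;
  for (const stop of stopWords) {
    const i = text.indexOf(stop);
    if (i !== -1 && (cut === -1 || i < cut)) cut = i;
  }
  return cut === -1 ? null : text.slice(0, cut);
}

// Inside callback_function you would decode the beam to text first, then:
//   const done = truncateAtStop(decodedText, ['\nUser:', '<|endoftext|>']);
//   if (done !== null) { /* stop updating the UI; keep `done` as the result */ }
```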
@xenova is this issue still open for contribution?