Support Buffers, Blobs, or Streams inside experimental_streamData, not just JSON keys.
Feature Description
I'm not sure what the technical constraints are, and this may be impossible, but here is my use case, which this would greatly improve, and the bottleneck I ran into.
I'm using PlayHT AI audio and want to attach the audio data alongside the text. Latency is important, so I want to do everything at once, inside the stream.
The major line in question is:
data.append({
  voiceData: Buffer.from(await resp.arrayBuffer()).toString("base64"),
});
As you can see, I'm working around the limitation by base64-encoding a Buffer and then decoding it back to audio on the client, because data only supports JSON values.
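For reference, the client-side half of this workaround might look roughly like the sketch below. The helper names (`base64ToBytes`, `base64ToAudioBlob`) are hypothetical and not part of the SDK; only `atob`, `Uint8Array`, and `Blob` are standard APIs, and the `audio/mpeg` type is assumed from the `mp3` output format requested from PlayHT.

```typescript
// Hypothetical helpers for decoding the base64 `voiceData` value on the client.

/** Decode a base64 string into raw bytes. */
function base64ToBytes(base64: string) {
  const binary = atob(base64); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

/** Wrap the decoded bytes in a Blob tagged as MP3 audio (assumed MIME type). */
function base64ToAudioBlob(base64: string, mimeType = "audio/mpeg"): Blob {
  return new Blob([base64ToBytes(base64)], { type: mimeType });
}
```

In the browser, the resulting Blob could then be played with something like `new Audio(URL.createObjectURL(blob)).play()`. This is exactly the extra decode step that native Buffer/Blob support would remove.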
Some may suggest using blob storage. I tried writing to Vercel Blob and passing the URL instead, but base64 was still faster.
Ideally there would be no conversions at all. Being able to send a Blob or Buffer directly in data would be very cool!
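To put a rough number on the cost of the conversion: base64 turns every 3 bytes into 4 characters, so the payload grows by about a third before it even hits the wire. A minimal sketch, where `base64Length` is a hypothetical helper and the clip size is made up for illustration:

```typescript
/** Length of the padded base64 text for a payload of `byteLength` bytes. */
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

const audioBytes = 300_000; // hypothetical MP3 clip size in bytes
const encodedChars = base64Length(audioBytes); // 400000 characters
const overhead = encodedChars / audioBytes - 1; // roughly 0.33, i.e. about 33% larger
```

Sending the raw bytes would avoid both this size increase and the encode/decode work on each side.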
Here is an example of my API:
import OpenAI from "openai";
import {
  OpenAIStream,
  StreamingTextResponse,
  experimental_StreamData,
} from "ai";

// Assumed setup: `prisma` and `voices` are app-specific and defined elsewhere (not shown).
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  // Extract `messages` and `personaName` from the body of the request
  const { messages, personaName } = await req.json();

  // Request the OpenAI API for the response based on the prompt
  const aiResponse = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    stream: true,
    messages: messages,
  });

  const data = new experimental_StreamData();

  const persona = await prisma.persona.findFirst({
    where: { name: personaName },
  });

  const stream = OpenAIStream(aiResponse, {
    onFinal: async (completion) => {
      try {
        const voicesFiltered = voices.filter(
          (v) =>
            v.voice_engine === "PlayHT2.0" &&
            v.gender === persona?.gender &&
            v.accent === persona?.accent
        );

        const resp = await fetch("https://api.play.ht/api/v2/tts/stream", {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            AUTHORIZATION: `${process.env.PLAYHT_SECRET_KEY}`,
            "X-USER-ID": process.env.PLAYHT_USER_ID!,
            accept: "audio/mpeg",
          },
          body: JSON.stringify({
            text: completion,
            // Fall back to a random matching voice; `?.` avoids a throw when none match
            voice:
              persona?.voiceId ??
              voicesFiltered[Math.floor(Math.random() * voicesFiltered.length)]
                ?.id,
            output_format: "mp3",
            voice_engine: "PlayHT2.0-turbo",
          }),
        }).catch((err) => console.log("fetch error:", err));

        // Skip the audio if the request failed or returned an error status
        if (!resp || !resp.ok) return;

        // Hack: base64-encode the audio because data only accepts JSON values
        data.append({
          voiceData: Buffer.from(await resp.arrayBuffer()).toString("base64"),
        });
      } finally {
        // IMPORTANT! You must close StreamData manually or the response will never finish.
        // Closing in `finally` also covers the early return and any thrown error.
        data.close();
      }
    },
    // IMPORTANT! Until this is stable, you must explicitly opt in to supporting streamData.
    experimental_streamData: true,
  });

  // Respond with the stream
  return new StreamingTextResponse(stream, {}, data);
}
Use Case
Streaming voice audio alongside AI text responses. There are probably many other Buffer use cases people have as well: images, webcam streams, etc.
Additional context
No response