streams
streams copied to clipboard
Proposal: ReadableStreamBodyReader
Consuming a ReadableStream as either an ArrayBuffer, Blob, FormData, JSON, or text is a common use case. We can see this in Body mixin from fetch and the methods implemented by Request and Response, as well as in utilities such as Node.js' custom "stream consumers" documented here: https://nodejs.org/dist/latest-v18.x/docs/api/webstreams.html#utility-consumers
The pattern is so common, it would be helpful to be able to standardize the pattern with a new type of standard reader type... Specifically something like:
interface ReadableStreamBodyReader {
constructor(ReadableStream stream);
readonly attribute boolean bodyUsed;
[NewObject] Promise<ArrayBuffer> arrayBuffer();
[NewObject] Promise<Blob> blob();
[NewObject] Promise<FormData> formData();
[NewObject] Promise<any> json();
[NewObject] Promise<USVString> text();
undefined releaseLock();
}
ReadableStreamBodyReader includes ReadableStreamGenericReader;
Use example:
const readable = getReadableStreamSomehow();
const reader = readable.getReader({ mode: 'body' });
await reader.arrayBuffer();
// or
await reader.text();
// or
await reader.json();
// or
await reader.formData();
// or
await reader.blob();
Note that the ReadableStreamGenericReader could but does not just include the Body mixin. This is because the Body mixin includes the body getter that really wouldn't make sense in this case. We could use it, however, if that is the preference.
The key benefit of adding this is that it would standardize what is becoming a common pattern.
@jasnell How would reader.formData() work? It requires access to a content-type to be able to parse form data.
Couple of options there:
- Calling
formData()can assume that the content is multipart/form-data, rejecting the promise only if it cannot be successfully parsed as such, or... - The
formData()method here could be augmented to accept aninitparameter specifying atype. This would make it a departure from the Body mixin definition but achieves the goal.
Sorry, I wasn't clear. body.formData() requires the ;boundary=<boundary> parameter of the content type to parse, so parsing it without a content type is not an option. I think requiring folks use new Response(stream, { headers }).formData() is probably fine. (ie don't include form data in this proposal)
ah right, forgot about that boundary parameter on the content-type. In that case, we either do as you suggest and limit it to new Response(stream, { headers }).formData() or allow for an init argument on reader.formData() to specify the content-type. As one may imagine implementing new Response(stream, { headers }).formData() using this reader, the latter could make sense.
As one may imagine implementing
new Response(stream, { headers }).formData()using this reader
You mean as an engine, or as a user? Because new Response(stream).json() etc is already possible.
I think the crux of this proposal is if it makes sense to add another stream reader type, considering one can already do the proposed behaviour with new Response(stream).json() etc.
Succinctness alone is not a compelling argument in my opinion, because the new reader mode is more verbose than using new Response:
> "readable.getReader({ mode: 'body' })".length
36
> "new Response(readable)".length
22
Making this a proper reader can also have weird consequences. What should happen if you call releaseLock() while arrayBuffer() is still pending, and then you acquire a new reader?
const stream = new ReadableStream({ type: "bytes" });
const reader1 = stream.getReader({ mode: "body" });
reader1.arrayBuffer().catch(() => {});
// ...some time passes
reader1.releaseLock();
const reader2 = stream.getReader();
await reader2.read(); // what does this return?
Will reader2 see the full contents of stream? Or will reader1 have already consumed some bytes, such that reader2 only sees the remaining bytes?
It's also a bit weird conceptually: the Body mixin is defined on top of ReadableStream, but now you'd be able to consume a ReadableStream as a Body too? It's unclear which of the two is the "base primitive". 😛
We already proposed the idea of adding something like ReadableStream.prototype.arrayBuffer() in https://github.com/whatwg/streams/issues/1019. I think that's a better starting point, and then response.arrayBuffer() could become a shorthand for response.body.arrayBuffer(). 🙂
What should happen if you call releaseLock() while arrayBuffer() is still pending, and then you acquire a new reader?
I think in that case we'd have no choice but to error the stream if releaseLock() is called while one of these operations is pending.
Succinctness alone is not a compelling argument in my opinion, because the new reader mode is more verbose than using new Response:
No, definitely not the only argument. The goal is to standardize the pattern even in cases where Response/Request are not used. If all I want is to be able to consume a ReadableStream as a concatenated text string, I shouldn't have to create a Response object to do so. The ReadableStream.prototype.arrayBuffer() option would be a good alternative to the body reader proposal, although ReadableStream.prototype.blob() might be better as it gives more opportunity for internal optimizations.
How would
reader.formData()work? It requires access to acontent-typeto be able to parse form data.
the stream could also just read the first line and acquire the boundary from reading just some initial bytes...
What are folks' thoughts about advancing this proposal forward? I think for the FormData issue we could actually just omit formData() from this proposal.
My preference is for #1019.
Let's close this in favor of the other proposal