deno-aws_api
S3: getObject response body streaming?
I see this note on the getObject implementation for S3, and I know #24 exists, but it seems to be focused on uploading objects, not getting them. Being able to stream objects down from S3 would be awesome.
My use case:
My S3 buckets are locked down and can't be publicly accessed, so I would like to stream an object from S3, through my server, to the client, without needing to buffer the entire object. A workaround would be to create a presigned URL for retrieving the object from S3 and have the client use that instead of my server. I just don't like exposing the underlying cloud infra, if that makes sense.
Hey, yes, #24 is about request bodies. Comparatively, response bodies have almost no blockers to supporting streaming; it's just a question of API design. .getObject() unconditionally returns the data as a Uint8Array buffer, as you saw:
https://github.com/cloudydeno/deno-aws_api/blob/3ce25f2e9fb1f547bad61afdaa4676f60ddce497/lib/services/s3/mod.ts#L1027
So how should a streaming body be requested and then returned? Is changing Uint8Array to ReadableStream<Uint8Array> enough, or should I return a whole-ass Response object so you can also do .text() or whatever? I haven't seen how the official AWS SDK gives streaming response bodies, FWIW.
Input welcome on how to present streaming response bodies :)
A workaround would be to create a presigned URL for retrieving the object from S3 and have the client use that instead of my server
You can also make a pre-signed URL and then immediately fetch that URL from the same process! So you can contain the cloud layout within the server. Still a workaround of course.
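As a rough sketch of that workaround: the server obtains a pre-signed URL, fetches it itself, and hands the response body stream straight to the client. The presign parameter is a hypothetical stand-in for however the URL gets signed (this thread does not show a signing API), and proxyObject is an illustrative name, not part of the library.

```typescript
// Hypothetical: any function that yields a pre-signed GET URL for an object.
type Presigner = (bucket: string, key: string) => Promise<string>;
// Minimal fetch shape; the global fetch satisfies it.
type Fetcher = (url: string) => Promise<Response>;

async function proxyObject(
  presign: Presigner,
  bucket: string,
  key: string,
  fetchImpl: Fetcher = fetch,
): Promise<Response> {
  // The signed URL never leaves the server process.
  const url = await presign(bucket, key);
  const upstream = await fetchImpl(url);
  if (!upstream.ok) {
    // Discard the upstream body so the connection can be released.
    await upstream.body?.cancel();
    return new Response("Object unavailable", { status: 502 });
  }
  // Pass the body stream through as-is; nothing is buffered here.
  return new Response(upstream.body, {
    status: 200,
    headers: {
      "content-type": upstream.headers.get("content-type") ??
        "application/octet-stream",
    },
  });
}
```

The client only ever sees the server's own response, so the bucket layout and the signed URL stay hidden.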
Great point! Worth a try in the meantime.
My intuition says that a ReadableStream would suffice, and would be more kosher with Deno, and I think less opinionated and ergo more flexible. Body being an entire Response object may be confusing, since technically the whole object returned by getObject is the "response" from S3. The caller could always instantiate a Response themselves around the Body ReadableStream if they wanted the Response APIs. Just my initial thoughts.
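To illustrate the flexibility argument, here is a minimal sketch of consuming a bare ReadableStream chunk by chunk, using only the standard web streams API. The countBytes helper is a made-up example; any per-chunk work (hashing, forwarding, writing to disk) would go where the counter is.

```typescript
// Illustrative helper: walks the stream one chunk at a time, so memory use
// stays bounded by the chunk size rather than the object size.
async function countBytes(
  stream: ReadableStream<Uint8Array>,
): Promise<number> {
  const reader = stream.getReader();
  let total = 0;
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) break;
      total += result.value.byteLength;
    }
  } finally {
    // Release the lock even if reading throws.
    reader.releaseLock();
  }
  return total;
}
```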
Looking at the API from the Node world:
SDK v3: S3's GetObjectCommand resolves with a Body that is a Node readable stream
SDK v2: buffered the response into memory, which introduces the same challenges discussed in this issue
My intuition says that a ReadableStream would suffice
This is looking like the most reasonable answer for a Deno-first library and has great synergy with Deno.writeFile() etc., but I'm still bothered about adding one of these lines whenever grabbing e.g. configuration files from S3:
// different ways of buffering a stream:
const bodyBytes = new Uint8Array(await new Response(resp.Body).arrayBuffer());
const bodyText = await new Response(resp.Body).text();
const bodyJson = await new Response(resp.Body).json();
I'm concerned about discoverability, because it's not obvious to use new Response. I tried searching "deno readablestream to string" etc. on Google and didn't get useful results for this use case.
I'm adding a tsdoc comment on the bodies (and a release note) and calling it a day, because ReadableStream<Uint8Array> is truthfully the correct thing:
/** To get this stream as a buffer, use `new Response(...).arrayBuffer()` or related functions. */
Body?: ReadableStream<Uint8Array> | null;
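Since the field is optional and nullable, callers who buffer small objects often may want to wrap the new Response(...) trick once. A minimal sketch, assuming nothing beyond the web-standard Response; bodyToBytes and bodyToText are hypothetical names, not helpers the library provides:

```typescript
// Mirrors the shape of the Body field above.
type StreamBody = ReadableStream<Uint8Array> | null | undefined;

// Hypothetical helper: buffer the whole stream into bytes.
// A missing body is treated as empty (an assumption; throwing is also valid).
async function bodyToBytes(body: StreamBody): Promise<Uint8Array> {
  if (!body) return new Uint8Array(0);
  return new Uint8Array(await new Response(body).arrayBuffer());
}

// Hypothetical helper: buffer the whole stream and decode it as UTF-8 text.
async function bodyToText(body: StreamBody): Promise<string> {
  if (!body) return "";
  return await new Response(body).text();
}
```

A call site would then read something like `const config = JSON.parse(await bodyToText(resp.Body));`.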
🚀 This just shipped in v0.8.0 as a breaking change
For specific result fields which are the entire contents of the response body, the returned structure will now contain a ReadableStream<Uint8Array> instead of a buffered Uint8Array.
Hey @danopia. Sorry, I've been caught up on other things. This looks really cool though, I'm going to give it a try!
Thanks for your work on this!