deno-aws_api
WIP: Streaming request bodies (for S3)
Turns out this is pretty tough API-wise because S3 uploads must have a Content-Length header, so the only way to 'stream' an upload of truly unknown length is to initiate a multi-part upload.
+1 for this API. Right now, I am using putObject, which accepts a Buffer or, more broadly speaking, a Uint8Array, but I have to read the whole file into the buffer and send it.
With a streaming API, similar to the AWS SDK's upload, I could pipe a request body directly to S3, which would greatly improve performance.
Thank you for reporting your use-case!
So looking at upload(), that specific function is implemented by the AWS.S3.ManagedUpload class. It appears to chop your stream into individual 5MB segments and upload them with a Multipart Upload strategy. This is actually separate from streaming request bodies because each 'part' is buffered. So I will track managed/multipart uploads in a separate issue 😅
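For anyone curious what that chopping looks like in practice, here's a rough sketch (not the SDK's actual code) of splitting a ReadableStream<Uint8Array> into ~5 MiB parts; the part size and generator are purely illustrative, but each yielded part would become the body of one UploadPart request:

```ts
// Sketch only: how a managed upload might chop a stream into ~5 MiB parts.
// Each yielded Uint8Array would become one UploadPart request body.
const PART_SIZE = 5 * 1024 * 1024;

async function* chunkStream(
  body: ReadableStream<Uint8Array>,
  partSize = PART_SIZE,
): AsyncGenerator<Uint8Array> {
  const reader = body.getReader();
  let buffer = new Uint8Array(0);
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (value) {
        // Append the new bytes onto whatever is already buffered.
        const next = new Uint8Array(buffer.length + value.length);
        next.set(buffer);
        next.set(value, buffer.length);
        buffer = next;
      }
      // Emit full-size parts as soon as we have them.
      while (buffer.length >= partSize) {
        yield buffer.slice(0, partSize);
        buffer = buffer.slice(partSize);
      }
      if (done) break;
    }
    // Final (possibly short) part.
    if (buffer.length > 0) yield buffer;
  } finally {
    reader.releaseLock();
  }
}
```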
S3 only supports true streaming uploads if you know the length of the body upfront. That's what this PR ⬆️ implements. Maybe you know your object size ahead of time, in which case you don't actually need the multipart upload() (but the parallelization probably still helps with speed on huge objects).
@TillaTheHun0 I have a working pass on chunked/parallelized S3 uploading in #31, please feel free to vet the behavior in your own stuff before I get it clean enough to merge. You can import { multiPartUpload } from "https://raw.githubusercontent.com/cloudydeno/deno-aws_api/89703336482f008f3f6ebd9f759370fd393ba362/lib/helpers/s3-upload.ts" and then call it with an S3 client as shown in #31's text. If you have any feedback please comment on that PR. Thanks again!
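For anyone trying it out, a minimal sketch of what the call might look like; the option names passed to multiPartUpload and the client setup are my assumptions here, so check #31's description for the real call shape:

```ts
// Sketch only: the exact options multiPartUpload takes are described in #31;
// the names below are assumptions for illustration.
import { multiPartUpload } from "https://raw.githubusercontent.com/cloudydeno/deno-aws_api/89703336482f008f3f6ebd9f759370fd393ba362/lib/helpers/s3-upload.ts";

// Stand-in for a deno-aws_api S3 client constructed elsewhere (e.g. via ApiFactory).
declare const s3: any;

// Hand the helper a file's ReadableStream and let it split the body
// into multiple S3 requests.
const file = await Deno.open("./big-file.bin", { read: true });
await multiPartUpload(s3, {
  Bucket: "my-bucket",           // assumed option name
  Key: "uploads/big-file.bin",   // assumed option name
  Body: file.readable,           // assumed to accept a ReadableStream<Uint8Array>
});
```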
@danopia oh sweet, I will check it out! 👍
@TillaTheHun0 May I know how to supply a Buffer to putObject? Looking at the s3.file, I could only supply either one of the following values:
- Uint8Array
- string
@yogesnsamy the code is here
Basically we're using readAll, which accepts a Deno.Reader and reads the contents into a Uint8Array that we pass to putObject.
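Roughly, that looks like the snippet below; the bucket, key, and file path are made up and the client construction is omitted, but the putObject parameter names (Bucket, Key, Body) follow the S3 PutObject request shape:

```ts
// Sketch of the buffer-then-upload approach described above.
// Stand-in for a deno-aws_api S3 client built elsewhere (e.g. via ApiFactory).
declare const s3: any;

const file = await Deno.open("./report.csv", { read: true });
// Deno.readAll takes a Deno.Reader and buffers everything into one Uint8Array.
// (Newer Deno versions removed Deno.readAll; std's readAll helper does the same thing.)
const body = await Deno.readAll(file);
file.close();

await s3.putObject({
  Bucket: "my-bucket",
  Key: "reports/report.csv",
  Body: body, // the whole file is in memory at this point
});
```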
Many thanks @TillaTheHun0. It works.
Question for y'all, when calling PutObject with a Reader, do you know the byte size of your Reader upfront? That would make streaming easier to implement
🚀 A managed-upload module (using S3 multipart) just shipped in v0.8.0. This is most useful for uploading large files (50MB and up) as it will break up your ReadableStream<Uint8Array> into multiple S3 requests.
🗒️ Also, hot tip: in cases where you aren't worried about the file fitting into memory, you can use a Response to easily buffer a ReadableStream<Uint8Array>:
const bodyBuffer = new Uint8Array(await new Response(bodyStream).arrayBuffer());
🗑️ This pull request is still in draft and I don't think it's going to land this time because it's not very useful. True request streaming to S3 is only possible if the body's length is known upfront. That limitation means that this library cannot just accept a ReadableStream<Uint8Array> on its own. The library could buffer the stream for you, but then it wouldn't really be streaming, so I'm not immediately in favor of it 🤔