The 'total' property is always undefined when uploading files larger than 5MB
Checkboxes for prior research
- [X] I've gone through Developer Guide and API reference
- [X] I've checked AWS Forums and StackOverflow.
- [X] I've searched for previous similar issues and didn't find any solution.
Describe the bug
When uploading a large file as a stream with the Upload module in a web app, the 'total' property of the httpUploadProgress event object is always undefined.
SDK version number
@aws-sdk/[email protected]
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
v18.13.0
Reproduction Steps
- set up a Node.js project
- install @aws-sdk/client-s3 and @aws-sdk/lib-storage
- prepare a file to upload (larger than 5 MB)
- run the following program
- observe that ev.total is always undefined
```js
const fs = require("fs");
const stream = require("stream");
const S3Client = require("@aws-sdk/client-s3").S3Client;
const Upload = require("@aws-sdk/lib-storage").Upload;

const s3 = new S3Client({
  // your s3 settings
});

async function main() {
  // Use a file larger than 5MB.
  const fileStream = fs.createReadStream("/path/to/your/over5mb/file.dat");
  const passThrough = new stream.PassThrough();
  // Pipe through a PassThrough to remove fileStream.path
  // and thereby emulate a browser file upload.
  fileStream.pipe(passThrough);
  const upload = new Upload({
    client: s3,
    params: {
      Bucket: "your-bucket", // your Bucket
      Key: "your-key", // your Key
      Body: passThrough,
    },
  });
  upload.on("httpUploadProgress", function (ev) {
    // ev.total always undefined
    console.log(ev);
  });
  await upload.done();
}
main();
```
Debugging the code shows no assignment to this.totalBytes (the source of 'total') anywhere except in the constructor and in the single-chunk upload function.
Observed Behavior
The 'total' property of the httpUploadProgress event object is always undefined. (In the sample code, the console output of ev.total stays undefined to the end.)
Expected Behavior
The correct total file size is eventually set. (In the sample code, the final console output of ev.total would be the correct total file size.)
Possible Solution
No response
Additional Information/Context
aws-sdk v2 recalculated this.totalBytes on the stream's 'readable' event:
https://github.com/aws/aws-sdk-js/blob/b168eaab7aa04f8fc300b38bf67f9c26bf02c28a/lib/s3/managed_upload.js#L188
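For reference, a rough user-land sketch of that kind of running recalculation (illustrative only, not the SDK's code): count bytes as they flow through the pipeline, and treat the count as the final total once the source ends.

```js
const { Transform } = require("stream");

// Rough sketch (not the SDK's code): a pass-through Transform that counts
// bytes, so the true total of a length-less stream is known once the source
// ends, similar to the running recalculation v2 did internally.
function makeByteCounter(onTotalKnown) {
  let loaded = 0;
  return new Transform({
    transform(chunk, _encoding, callback) {
      loaded += chunk.length;
      callback(null, chunk); // pass data through unchanged
    },
    flush(callback) {
      onTotalKnown(loaded); // source ended: loaded is now the real total
      callback();
    },
  });
}

// e.g. fileStream.pipe(makeByteCounter((total) => console.log(total))).pipe(passThrough);
```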
Hi @ti2sugiyama, thanks for opening this issue. I can confirm the reported behavior and will investigate further to define next steps.
Thanks!
@ti2sugiyama what you say in the comment below is true:
> Debugging the code shows no assignment to this.totalBytes (the source of 'total') anywhere except in the constructor and in the single-chunk upload function.
I believe this issue only happens when the body given as a parameter does not have any of the properties evaluated in the conditions here. For now, I will mark this issue to be reviewed so it can be further addressed.
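To illustrate (the exact property names the SDK probes are in the linked conditions; byteLength/size/path here are my assumptions based on that check): a plain PassThrough carries none of the size hints that, for example, a fs.ReadStream does.

```js
import * as fs from "fs";
import * as stream from "stream";

// A fs.ReadStream carries a .path from which a size can be derived;
// piping through a PassThrough (as in the repro) drops that hint.
const fileStream = fs.createReadStream("./test-file.txt");
console.log(fileStream.path); // "./test-file.txt"

const passThrough = new stream.PassThrough();
fileStream.pipe(passThrough);
// None of the size hints are present on the PassThrough:
console.log(passThrough.byteLength, passThrough.size, passThrough.path); // undefined undefined undefined
```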
Steps to reproduce:
- Create an empty node project: `npm init -y`
- Add the dependencies: `npm i @aws-sdk/client-s3` and `npm i @aws-sdk/lib-storage`
- Then, create an empty file, e.g. `src/index.js`
- Copy and paste the following code into the file created above:
```js
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import * as fs from "fs";
import * as stream from "stream";

const client = new S3Client({
  region: process.env.TEST_REGION,
});

const pathToFile = "./test-file.txt";
const data = "#".repeat(1024 * 1024 * 10);
fs.writeFileSync(pathToFile, data);

const fileStream = fs.createReadStream(pathToFile);
const passThrough = new stream.PassThrough();
fileStream.pipe(passThrough);

const upload = new Upload({
  client: client,
  params: {
    Bucket: process.env.TEST_BUCKET,
    Key: process.env.TEST_KEY,
    Body: passThrough,
  },
});

upload.on("httpUploadProgress", (progress) => {
  console.log("Progress: ", progress);
});

await upload.done();
```
Thanks!
Same issue. It's been a year, hopefully fixed soon.
Facing the same issue here. @yenfryherrerafeliz suggested this approach in https://github.com/aws/aws-sdk-js-v3/discussions/5165; maybe there should be a note that it doesn't work for files larger than 5 MB?
Also, is using max of all ev.loaded instead of ev.total a reasonable workaround for now? Thanks!
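For concreteness, here is roughly what I mean (a sketch assuming the source is a local file, so its size is known up front via fs.statSync; bucket/key env var names are borrowed from the repro above):

```js
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import * as fs from "fs";
import * as stream from "stream";

const pathToFile = "./test-file.txt";
// The source is a local file, so its size is known before the upload starts.
const total = fs.statSync(pathToFile).size;

const passThrough = new stream.PassThrough();
fs.createReadStream(pathToFile).pipe(passThrough);

const upload = new Upload({
  client: new S3Client({ region: process.env.TEST_REGION }),
  params: {
    Bucket: process.env.TEST_BUCKET,
    Key: process.env.TEST_KEY,
    Body: passThrough,
  },
});

// Part events can arrive out of order, so keep the maximum ev.loaded seen
// so far rather than the latest value.
let maxLoaded = 0;
upload.on("httpUploadProgress", (ev) => {
  maxLoaded = Math.max(maxLoaded, ev.loaded ?? 0);
  console.log(`Progress: ${maxLoaded}/${total}`);
});

await upload.done();
```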
Any update on this?
Hello!
I managed to check every data type regarded as "accepted" within the Upload class: string | Uint8Array | Buffer | Readable | ReadableStream | Blob.
The findings:
- The only body types that correctly report totalBytes in a multipart upload are the non-stream ones: string | Uint8Array | Buffer | Blob.
- When using streams, it works fine for files smaller than the partSize that was defined on initialization. If the size is larger than that, you get an undefined total. This is because the stream is split into multiple parts as intended, and that is precisely the point where things get messy.
- This behavior is consistent with the __doConcurrentUpload execution code checking whether the upload has only 1 part to be uploaded.
- Then, __uploadUsingPut executes only when the file size is smaller than the partSize to be split into (minimum of 5 MB if none was defined). When reporting progress, it doesn't take the private class member this.totalBytes upon the __notifyProgress call. Instead, the size is calculated from the "dataPart" parameter, which was previously converted to the Buffer type.
What @yenfryherrerafeliz suggested is aligned with the idea that only those types are correctly managed upon initialization:
> I believe this issue only happens when the body given as a parameter does not have any of the properties evaluated in the conditions here. For now, I will mark this issue to be reviewed so it can be further addressed.
This is partially true, because the issue is strongly related to the Buffer conversion step, here and here. If the conversion to Buffer were performed before totalBytes is defined in the initialization step here, it would work.

On a multipart upload, __notifyProgress consumes the this.totalBytes value defined on initialization (referenced here), and there you go: undefined as the value. The byteLength function is not intended to support streams, but it works fine for Buffer, Uint8Array, or string.

I believe a good solution can be achieved by strengthening the types first, replacing the "any" here, which is in fact the real cause of the bug: a weak type check that allows multiple incompatible inputs to this function. Matching the type with this type definition should help. Secondly, additional code has to be added inside that function to handle streams on initialization. These strategies, 1 and 2, might be reused to that end.
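To make the intended shape concrete, here is a sketch of such a size probe (illustrative only, not the SDK's actual code; the exact hints checked live in the linked byteLength source):

```js
// Illustrative sketch (not the SDK's code): return a byte count for bodies
// whose size is knowable up front, and undefined for streams that would
// have to be consumed first.
function byteLengthOf(body) {
  if (body == null) return 0; // treat a missing body as empty
  if (typeof body === "string") return Buffer.byteLength(body);
  if (typeof body.byteLength === "number") return body.byteLength; // Buffer, Uint8Array
  if (typeof body.size === "number") return body.size; // Blob
  return undefined; // Readable / ReadableStream: size unknown without consuming
}
```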
I hope this helps towards a definitive solution 👍
Hey all,
I am able to replicate the issue when working with a file of 11 megabytes in size. The relevant package versions being utilized are:
"@aws-sdk/client-s3": "^3.614.0"
"@aws-sdk/lib-storage": "^3.614.0"/ "@aws-sdk/[email protected]"
The observed result during the file upload process is as follows:
```
Progress: {
  loaded: 2154984,
  total: undefined,
  part: 3,
  Key: 'imag_older_version',
  Bucket: 'new-bucket-maggie-ma'
}
Progress: {
  loaded: 7397864,
  total: undefined,
  part: 1,
  Key: 'imag_older_version',
  Bucket: 'new-bucket-maggie-ma'
}
Progress: {
  loaded: 12640744,
  total: undefined,
  part: 2,
  Key: 'imag_older_version',
  Bucket: 'new-bucket-maggie-ma'
}
```
Please note that sensitive information, such as bucket names, has been replaced with placeholders for privacy and security purposes.
Steps to reproduce:
- Create an empty node project: `npm init -y`
- Add the dependencies: `npm i @aws-sdk/client-s3` and `npm i @aws-sdk/lib-storage`
- Then, create an empty file, e.g. `src/index.js`
- Add a file which is over 5 MB, then run the code:
```js
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import * as fs from "fs";
import * as stream from "stream";

const client = new S3Client({
  // your region
});

const fileStream = fs.createReadStream("./your_file_name");
const passThrough = new stream.PassThrough();
fileStream.pipe(passThrough);

const upload = new Upload({
  client: client,
  params: {
    Bucket: "your_bucket_name",
    Key: "your_key",
    Body: passThrough,
  },
});

upload.on("httpUploadProgress", (progress) => {
  console.log("Progress: ", progress);
});

await upload.done().then((x) => {
  console.log("Upload done", x);
});
```
@aagenesds22 Regarding the strict typing suggestion: I understand your point. Since this is not a public API, strict typing may not be a necessity. Additionally, introducing dependencies on platform-specific types, especially those from the global namespace, can potentially lead to issues and complications.
I appreciate you taking the time to clarify this point and provide additional context. I have brought this up to the AWS SDK JS team, it's queued now.
Thanks! Maggie