[BUG]: Upload (@aws-sdk/lib-storage) with ContentMD5 fails with: `The XML you provided was not well-formed or did not validate against our published schema`
Checkboxes for prior research
- [X] I've gone through Developer Guide and API reference
- [X] I've checked AWS Forums and StackOverflow.
- [X] I've searched for previous similar issues and didn't find any solution.
Describe the bug
When I upload a file and ask S3 to verify the file's MD5 (via ContentMD5), I get an error: MalformedXML: The XML you provided was not well-formed or did not validate against our published schema
SDK version number
@aws-sdk/[email protected], @aws-sdk/[email protected],
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
node -v: v16.17.1
Reproduction Steps
const upload = new Upload({
  client,
  params: {
    Bucket: 'test',
    Key: 'demo.pdf',
    // note: data size > 30 MB
    Body: data,
    // data md5 value
    ContentMD5: 'wSunmxovn3F4x1+NV+/d1A==',
    Metadata: {
      'x-hash': options.hash,
    },
  },
});
await upload.done();
Observed Behavior
Upload failed and error found:
2023-01-03T02:14:46: MalformedXML: The XML you provided was not well-formed or did not validate against our published schema
2023-01-03T02:14:46: at throwDefaultError (/Users/skypesky/workSpaces/javascript/arcblock/did-storage/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22)
2023-01-03T02:14:46: at deserializeAws_restXmlCompleteMultipartUploadCommandError (/Users/skypesky/workSpaces/javascript/arcblock/did-storage/packages/s3-driver/node_modules/@aws-sdk/client-s3/dist-cjs/protocols/Aws_restXml.js:3086:43)
2023-01-03T02:14:46: at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-01-03T02:14:46: at async /Users/skypesky/workSpaces/javascript/arcblock/did-storage/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24
2023-01-03T02:14:46: at async /Users/skypesky/workSpaces/javascript/arcblock/did-storage/packages/s3-driver/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:14:20
2023-01-03T02:14:46: at async /Users/skypesky/workSpaces/javascript/arcblock/did-storage/node_modules/@aws-sdk/middleware-retry/dist-cjs/retryMiddleware.js:27:46
2023-01-03T02:14:46: at async /Users/skypesky/workSpaces/javascript/arcblock/did-storage/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:5:22
2023-01-03T02:14:46: at async Upload.__doMultipartUpload (/Users/skypesky/workSpaces/javascript/arcblock/did-storage/packages/s3-driver/node_modules/@aws-sdk/lib-storage/dist-cjs/Upload.js:226:22)
2023-01-03T02:14:46: at async Upload.done (/Users/skypesky/workSpaces/javascript/arcblock/did-storage/packages/s3-driver/node_modules/@aws-sdk/lib-storage/dist-cjs/Upload.js:39:16)
Expected Behavior
The upload should succeed.
Possible Solution
No response
Additional Information/Context
S3_REGION=ap-northeast-1
related: https://github.com/aws/aws-sdk-js-v3/issues/2673
Hi @skypesky, thanks for opening this issue. I can confirm this is a bug. The exception appears to be caused by the provided checksum: it was calculated for the whole file content, but it is sent along with each part, and each part needs a checksum calculated just for the chunk of data sent in that part. I can also confirm that the workaround proposed here works fine, but you should remove the md5 parameter from your code. I will mark this issue for review so we can address it further.
Repro steps: Installed the following packages:
yarn add @aws-sdk/client-s3
yarn add @aws-sdk/lib-storage
I used the following code:
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import * as crypto from "crypto";

const client = new S3Client({
  region: 'us-east-2'
});
const body = '#'.repeat(1024 * 1024 * 31);
const md5 = crypto.createHash("MD5").update(body).digest("base64");
const upload = new Upload({
  client: client,
  params: {
    Bucket: process.env.TEST_BUCKET,
    Key: process.env.TEST_KEY,
    Body: body,
    ContentMD5: md5,
    Metadata: {
      'x-hash': md5,
    },
  },
});
const response = await upload.done();
console.log(response);
Thanks!
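The adjustment suggested above (removing the md5 parameter) could look roughly like the sketch below. It assumes the whole-file MD5 is still wanted for later reference, so it is kept only in `Metadata`. The helper name `buildUploadParams` is hypothetical and not part of the SDK.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper (not part of the AWS SDK): builds Upload params that
// omit ContentMD5, so no whole-file checksum is attached to individual parts.
function buildUploadParams(bucket: string, key: string, body: string | Buffer) {
  // MD5 of the entire body, base64-encoded, kept only as user metadata.
  const md5 = createHash("md5").update(body).digest("base64");
  return {
    Bucket: bucket,
    Key: key,
    Body: body,
    Metadata: { "x-hash": md5 },
  };
}

// Example values only; bucket and key are placeholders.
const params = buildUploadParams("test", "demo.pdf", "#".repeat(1024));
console.log(Object.keys(params));
```

The resulting object can be passed as `params` to `new Upload({ client, params })`. Note that S3 only stores user metadata such as `x-hash`; it does not validate the upload against it, so any integrity check based on this value has to be done by the caller.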
@yenfryherrerafeliz
Thank you very much for your reply. I have a question: after this bug is fixed, will ContentMD5 accept the MD5 of the entire file?
@skypesky, I do not have a final picture of how it will work, but according to the documentation, each upload part command needs to send a checksum based on the data sent in that specific part.
Thanks!
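To illustrate why a whole-file MD5 cannot match any individual part, here is a minimal sketch that splits a buffer into parts and hashes each one separately. The 5 MiB part size is an assumption (it is S3's minimum part size for all but the last part), and `partChecksums` is a hypothetical helper, not how lib-storage works internally.

```typescript
import { createHash } from "node:crypto";

// Assumed part size for this sketch: 5 MiB.
const PART_SIZE = 5 * 1024 * 1024;

// Splits a buffer into consecutive parts and returns the base64 MD5 of each part.
function partChecksums(data: Buffer, partSize: number = PART_SIZE): string[] {
  const checksums: string[] = [];
  for (let start = 0; start < data.byteLength; start += partSize) {
    // subarray clamps the end index, so the last part may be shorter.
    const part = data.subarray(start, start + partSize);
    checksums.push(createHash("md5").update(part).digest("base64"));
  }
  return checksums;
}

// Same 31 MiB body as in the repro above.
const body = Buffer.alloc(31 * 1024 * 1024, "#");
const wholeFileMd5 = createHash("md5").update(body).digest("base64");
const perPart = partChecksums(body);

console.log(`parts: ${perPart.length}`);
console.log(`whole-file MD5 matches a part: ${perPart.includes(wholeFileMd5)}`);
```

With a 31 MiB body and 5 MiB parts this yields seven parts (six full ones and a 1 MiB remainder), and the whole-file digest differs from every per-part digest, which is consistent with the mismatch described above.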
I can confirm we are experiencing the same thing here. It works perfectly on smaller files (1-2 MB), but as soon as you send a larger file, it returns the XML error. Watching for the final solution so we can update our code.
@andyslack what are you doing in the meantime to circumvent this issue?