aws-sdk-js-v3
S3 commands hang after uploading an empty stream using `@aws-sdk/lib-storage`
Checkboxes for prior research
- [X] I've gone through the Developer Guide and API reference.
- [X] I've checked AWS Forums and StackOverflow.
- [X] I've searched for previous similar issues and didn't find any solution.
Describe the bug
Immediately ending the stream feeding the Upload utility from @aws-sdk/lib-storage sometimes leads to the S3 client getting "locked up": a subsequent command sent through it never finishes.
SDK version number
@aws-sdk/[email protected], @aws-sdk/[email protected]
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
v20.15.0
Reproduction Steps
Detailed steps to reproduce are listed in https://github.com/martinslota/aws-sdk-lib-storage-race. Those steps run against Localstack, but the same code can also be run against actual AWS S3.
Observed Behavior
After uploading an empty stream, commands sent through the S3 client never finish.
Expected Behavior
After uploading an empty stream, commands sent through the S3 client should continue to work as usual.
Possible Solution
No response
Additional Information/Context
No response
Hi @martinslota - thanks for reaching out.
Before proceeding, I'd like to ask you to confirm that the issue persists when running the code against the actual AWS service, as the example you shared appears to be using a LocalStack endpoint (https://s3.localhost.localstack.cloud:4566). While LocalStack is a great tool for local development and testing, it's always good practice to validate the behavior against the real AWS service before raising a formal issue.
Additionally, to better understand the interactions between your code and the AWS SDK, please add the following middleware to your S3 client's middlewareStack to get request logs. This middleware will log the context, request input, and response output for each AWS SDK operation, providing valuable insight into the data being sent and received.
client.middlewareStack.add(
  (next, context) => async (args) => {
    console.log("AWS SDK context", context.clientName, context.commandName);
    console.log("AWS SDK request input", args.input);
    const result = await next(args);
    console.log("AWS SDK request output:", result.output);
    return result;
  },
  {
    name: "MyMiddleware",
    step: "build",
    override: true,
  }
);
Looking forward to your response!
Hi @martinslota - thanks for reaching out.
👋
Before proceeding, I'd like to ask you to confirm that the issue persists when running the code against the actual AWS service, as the example you shared appears to be using a LocalStack endpoint (https://s3.localhost.localstack.cloud:4566). While LocalStack is a great tool for local development and testing, it's always good practice to validate the behavior against the real AWS service before raising a formal issue.
Yes, the same happens against the actual AWS S3 service. I also adapted https://github.com/martinslota/aws-sdk-lib-storage-race so it runs against S3 by default.
Additionally, to better understand the interactions between your code and the AWS SDK, please add the following middleware to your S3 client's middlewareStack to get request logs. This middleware will log the context, request input, and response output for each AWS SDK operation, providing valuable insight into the data being sent and received.
client.middlewareStack.add(
  (next, context) => async (args) => {
    console.log("AWS SDK context", context.clientName, context.commandName);
    console.log("AWS SDK request input", args.input);
    const result = await next(args);
    console.log("AWS SDK request output:", result.output);
    return result;
  },
  {
    name: "MyMiddleware",
    step: "build",
    override: true,
  }
);
Looking forward to your response!
I added the code above but additionally redacted the Body field from response output. The output looks like this.
Previously I also tried with an hour-long timeout instead of 5 seconds and I still saw the script fail.
I'm experiencing this as well
Hey @aBurmeseDev, sorry for the direct ping. Do you know where the problem might be? I can have a look; currently this issue is making roughly 1% of requests hang under high traffic.
I have run the reproduction repo that @martinslota kindly provided, and indeed it also happens locally, after around 5k uploads in my case. I tested with MinIO, so it's definitely a client-side issue.
This also happens with the actual AWS S3 service.
Hi @fenos - this is interesting! I haven't been able to reproduce it on my end with S3 itself, but I haven't tested with MinIO. I'll have to dig deeper into it. If there's any other specific case or pattern that I need to be aware of, please do share.
Yeah, it is really random, I don't really have a reliable way of reproducing it yet
Sometimes the done() promise from the Upload class is never resolved, and no requests are sent by the SDK for uploading.
I have OpenTelemetry instrumentation on the AWS SDK and added custom spans in the uploader, and it simply hangs on done() without even sending any request. I have an idle timeout of 60 seconds that kicks in and then closes the request socket.
I can confirm that it is not related to the default agent option maxSockets of 50, since I've increased that number to 700 and instrumented the Agent. Free sockets stay at 60-75 whereas busy sockets are at 20-30 at any given time.
I have tried with a couple of different setups for the Agent, using native and others. Currently, I'm using agentkeepalive which simply extends the native Agent with some better connection handling.
In my app I pretty much pass the request body as a stream to the Uploader:
const s3Client = new S3Client({
  ...credentials,
  requestHandler: new NodeHttpHandler({
    // agentkeepalive Agent extending the native https Agent
    httpsAgent: new Agent({
      maxSockets: 700,
      freeSocketTimeout: 15000
    })
  })
})

fastify.post('/upload', async (req, reply) => {
  const upload = new Upload({
    client: s3Client,
    params: {
      ...bucketAndKey,
      // body as Readable stream (http.IncomingMessage)
      Body: req.raw
    }
  })
  await upload.done()
  return reply.send('OK')
})
Files are particularly small, ~1MB. I'll share as soon as I have more findings; currently adding as much telemetry as possible.
This issue only occurs with the Uploader
Appreciate the details. Please do share once you have more findings while I further look into it on my end.
Hello @aBurmeseDev I've added more debugging spans and here is what's happening:
The metadata I gathered is the following:
Request Content-Length: 320,817 bytes
{
# Bytes uploaded from the Upload class
"bytesUploaded":0,
# Passthrough proxy Stream information
"dataStream":{
"closed":false,
"paused":false,
"errored":false,
"writable":true,
"byteRead":130853
},
# custom HttpAgent pool count
"httpAgentStats":{
"busySocketCount":0,
"freeSocketCount":1,
"pendingRequestCount":0,
"errorSocketCount":0,
"timeoutSocketCount":0,
"createSocketErrorCount":0
},
# Parts uploaded
"progress": []
}
Telemetry:
- Use a passthrough stream to read from the request stream and count the bytes read (just to make sure the client is actually sending the body)
- Listen for the httpUploadProgress event from the SDK to determine how many bytes were uploaded
- Print the httpPool socket information from the HTTP agent passed to the NodeHttpHandler
The code looks like this:
async function uploadObject(
bucketName: string,
key: string,
body: Readable,
signal?: AbortSignal
): Promise<void> {
if (signal?.aborted) {
throw ERRORS.Aborted('Upload was aborted')
}
// Create a Proxy Stream
const passThrough = new PassThrough()
if (signal) {
addAbortSignal(signal, passThrough)
}
passThrough.on('error', () => {
body.unpipe(passThrough)
})
body.on('error', (err) => {
if (!passThrough.closed) {
passThrough.destroy(err)
}
})
const byteReader = createByteCounterStream()
const bodyStream = body.pipe(passThrough)
// Watch upload stream
let upload: Upload | undefined = undefined
// Upload stats
const uploadProgress: Progress[] = []
// Get current stream status
const getStreamStatus = (): StreamStatus => ({
time: new Date(),
bytesUploaded: uploadProgress[uploadProgress.length - 1]?.loaded || 0,
dataStream: {
closed: bodyStream.closed,
paused: bodyStream.isPaused(),
errored: Boolean(bodyStream.errored),
writable: bodyStream.writable,
byteRead: byteReader.bytes,
},
httpAgentStats: gatherHttpAgentStats(this.agent.httpsAgent.getCurrentStatus()),
progress: uploadProgress,
})
let streamStatus = getStreamStatus()
const streamWatcher = setInterval(() => {
streamStatus = getStreamStatus()
}, 1000)
// Upload the stream
try {
const data = await stream.pipeline(
bodyStream,
byteReader.transformStream,
async (bodyStream) => {
if (signal?.aborted) {
throw ERRORS.Aborted('Upload was aborted')
}
upload = new Upload({
client: this.client,
params: {
Bucket: bucketName,
Key: key,
Body: bodyStream as Readable,
},
})
upload.on('httpUploadProgress', (progress) => {
uploadProgress.push({
total: progress.total,
part: progress.part,
loaded: progress.loaded,
})
if (uploadProgress.length > 100) {
uploadProgress.shift()
}
})
signal?.addEventListener(
'abort',
() => {
upload?.abort()
},
{ once: true }
)
return await upload.done()
},
{ signal }
)
} catch (err: any) {
if (err instanceof Error && err.name === 'AbortError') {
const span = trace.getActiveSpan()
if (span) {
// Print how far we got uploading the file
const lastStreamStatus = getStreamStatus()
const { progress, ...lastSeenStatus } = streamStatus
span.setAttributes({
lastStreamStatus: JSON.stringify(lastStreamStatus),
lastSeenStatus: JSON.stringify(lastSeenStatus),
})
}
}
throw err
} finally {
clearInterval(streamWatcher)
}
}
As you can see, the request sometimes gets stuck in the __doConcurrentUpload method even though there is data in the stream. I have a timeout of 60s which aborts the signal if the socket is idling.
The server is running on ECS Fargate.
I'm not sure where else I can trace this down further, I hope this helps to track down the issue. If not please let me know where else I should be adding telemetry.
Commenting to keep this thread alive as I am also experiencing this issue.
+1
@aBurmeseDev just bumping this up, as it is still happening with the latest version of the SDK. It happens consistently for me. For example, when I upload 100 empty files concurrently using the Upload class (I also tried with putObject, same issue) and then send a HeadObject request for each, around 36 succeed (upload + head) and the remaining hang indefinitely on the HeadObject request.
The maxSockets option on my S3Client is set to 700, so there must be something blocking elsewhere in the SDK and not in the HTTP agent.
I'll debug this further; hopefully I can get to the bottom of the problem.
also related https://github.com/aws/aws-sdk-js-v3/issues/7165
I've added the following middleware to determine whether the request hangs, and indeed it does:
this.client.middlewareStack.add(
(next, context) => async (args) => {
const r = setTimeout(() => {
console.log('COMMAND HANGS', {
clientName: context.clientName,
commandName: context.commandName,
input: args.input,
stats: this.agent.httpsAgent.getCurrentStatus(),
})
}, 10000)
const result = await next(args)
clearTimeout(r)
return result
},
{
name: 'MyMiddleware',
step: 'build',
override: true,
}
)
receiving the output:
COMMAND HANGS {
clientName: 'S3Client',
commandName: 'HeadObjectCommand',
input: {
Bucket: '[Redacted]',
Key: 'bjhaohmqunupljrqypxz/empty-bucket-2/aa0d/aa01/aa00/aa00/aaaaa-test-4/a-file-2025-08-01T13-36-21-323Z-1754055390089-0.8331603008948798/.somethingElse/7f087fe7-2f3f-4a7f-9a71-e3443532a059'
},
stats: {
createSocketCount: 101,
createSocketErrorCount: 0,
closeSocketCount: 101,
errorSocketCount: 0,
timeoutSocketCount: 0,
requestCount: 606,
freeSockets: {},
sockets: {},
requests: {}
}
}
Hi folks, this is Trivikram - one of the maintainers of AWS SDK for JavaScript (v3).
I looked into the discussions since the first report, and here is a summary from my understanding:
- It's the SDK operation call which hangs, likely after the build step as per https://github.com/aws/aws-sdk-js-v3/issues/6426#issuecomment-3144633409
- It's related to the content of the stream. In one case, empty files were uploaded. In another case, an ongoing upload stream was ended.
I was not able to reproduce https://github.com/martinslota/aws-sdk-lib-storage-race even after uploading 1000+ objects
Export finished
AWS SDK context S3Client PutObjectCommand
AWS SDK request input {
Bucket: 'test-js-6426',
Key: 'object_1110',
Body: <Buffer 73 6f 6d 65 20 64 61 74 61>
}
I'll continue looking into it to check whether a consistent minimal repro can be created. If you come across a minimal repro, please share it. Or, if the existing minimal repro works for you, do share whether any specific steps need to be followed.
From the provided data, it appears that the issue is not specific to the Upload utility in lib-storage. It's unique to a HeadObject request issued after a PutObject operation was complete. In the repro which folks are able to reproduce, the PutObject call was made for an empty file.
We don't know where the issue is yet. The question which needs an answer is whether the SDK makes an HTTP request for the HeadObject call:
- For folks who used LocalStack in their repro, was the request registered in your local backend?
- For folks able to reproduce with S3, we might want to override NodeHttpHandler itself to check the status of the .handle call.
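As a sketch of that last suggestion (illustrative only, not SDK code): a delegating request handler can log when .handle is entered and when it resolves, which distinguishes "request never sent" from "response never received". In a real client, the inner handler would be new NodeHttpHandler() from @smithy/node-http-handler; here it is any object with a compatible handle(request, options) method.

```javascript
// Illustrative sketch: wrap a request handler to log entry/exit of
// .handle(). `inner` is assumed to expose handle(request, options),
// as NodeHttpHandler from @smithy/node-http-handler does.
class LoggingHandler {
  constructor(inner) {
    this.inner = inner;
  }
  async handle(request, options) {
    console.log(`.handle start: ${request.method} ${request.path}`);
    const result = await this.inner.handle(request, options);
    console.log(`.handle done: ${request.method} ${request.path}`);
    return result;
  }
}
// Hypothetical usage:
//   new S3Client({ requestHandler: new LoggingHandler(new NodeHttpHandler()) })
```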
@trivikr I just revived https://github.com/martinslota/aws-sdk-lib-storage-race a little bit and improved the instructions, both for running against Localstack and against AWS. In case they do not work for you, please do ping me; I'm happy to improve them further.
For me, the repro still reproduces the issue.
For folks who used LocalStack in their repro, was the request registered in your local backend?
The last request log I see in Localstack's logs is a successful response to the GetObject command, after which the script hung. This is to say that when the tail of the script output looks like this:
Downloading batch with key object_282...
AWS SDK context S3Client GetObjectCommand
AWS SDK request input { Bucket: 'martin-slota-test', Key: 'object_282' }
file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70
const timeoutError = new TimeoutError();
^
TimeoutError: Promise timed out after 5000 milliseconds
at file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70:24
at new Promise (<anonymous>)
at pTimeout (file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:48:25)
at file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/index.js:123:9
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Node.js v20.15.0
the last request log from Localstack looks like this:
2025-08-13 19:50:52 2025-08-13T17:50:52.210 INFO --- [et.reactor-0] localstack.request.aws : AWS s3.GetObject => 200; 000000000000/eu-west-1; GetObjectRequest({'Bucket': 'martin-slota-test', 'IfMatch': None, 'IfModifiedSince': None, 'IfNoneMatch': None, 'IfUnmodifiedSince': None, 'Key': 'object_282', 'Range': None, 'ResponseCacheControl': None, 'ResponseContentDisposition': None, 'ResponseContentEncoding': None, 'ResponseContentLanguage': None, 'ResponseContentType': None, 'ResponseExpires': None, 'VersionId': None, 'SSECustomerAlgorithm': None, 'SSECustomerKey': None, 'SSECustomerKeyMD5': None, 'RequestPayer': None, 'PartNumber': None, 'ExpectedBucketOwner': None, 'ChecksumMode': None}, headers={'x-amz-user-agent': 'aws-sdk-js/3.637.0', 'user-agent': 'aws-sdk-js/3.637.0 ua/2.0 os/darwin#24.6.0 lang/js md/nodejs#20.15.0 api/s3#3.637.0', 'Host': 'martin-slota-test.s3.localhost.localstack.cloud:4566', 'amz-sdk-invocation-id': '131436c8-7fa8-42b1-b7d2-1b30bb856b35', 'amz-sdk-request': 'attempt=1; max=3', 'x-amz-date': '20250813T175052Z', 'x-amz-security-token': 'IQoJb3JpZ2luX2VjEOr//////////wEaCWV1LXdlc3QtMSJGMEQCIGIP2pLZzkfGBQDdwiyBT0SAvH4dUc0y6AS5lknoEq+hAiB3d0rdOoNkbKlarhzQqe8GEk8klsDf//9LVxdSkhiOWSrLAggyEAIaDDUzMTc3NjM3MjQ5MyIMrR3p9nUnH3ACLRrcKqgCcOgHUgv/avK8FYUNTqLvVYKCGy0ljXNzxyiwVDrcKoNPav//dwNmRy5abZnSd0tJ/rO80GVFXRN0EnLQSkXw3FHDnfAnY/r9FFDrq0LjK1n7FyPihVvh3TDXSdYsct/8bftmdU4H8fQrlofSap71IY/1HTG8EbwX0vejl4dZ3R6I097WyDiWtcz1EPrJmjer4IKiTW5xCrBb1OFZTgj5uTjcqkMhSWtpBYWN9fXKTZkdl+32ansLMyCgCFSNZfq/dsGK1n/0JcDzzeGmkjRbAuD9povVeJekbHd1OLL0vtvZl2ensvDHtsu1JBgJlvOC+nVtCcflmBsvtdjFls8X61Org79Q4fC6dd3BYHaeHsW4tFhZDbSHar5oH6cc1cLV4iICYiVrfGkwzI7zxAY6ngHWQQzBAHqKxggtbnboxxiWRa4JiWz8FABN5UrL3rYlX7RelteWByL1ekSE5iP9jSVeanOpzhgVOTMdxDNqgn8Fp954EgS4RL5LXfwOFFYd8E+hRGluSfiIFKK2C9dmtRTkf2Xw2oKNXy46KfHrEPGFFzwnupL0RZs2OC15J2HlQKsstZPeMmC551QT86TKg2XOpOqmfN3gyPO27Ei0wg==', 'x-amz-content-sha256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'authorization': 
'AWS4-HMAC-SHA256 Credential=ASIAXXUCXP4GUWFQQ3NQ/20250813/eu-west-1/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;host;x-amz-content-sha256;x-amz-date;x-amz-security-token;x-amz-user-agent, Signature=22b49cddfd79d02194d51a69da6a461f32e9317322e8371e1bc089ddfb80dc58', 'Connection': 'keep-alive', 'x-moto-account-id': '000000000000'}); GetObjectOutput({'AcceptRanges': 'bytes', 'LastModified': 'Wed, 13 Aug 2025 17:50:52 GMT', 'ContentLength': '9', 'ETag': '"1e50210a0202497fb79bc38b6ade6c34"', 'ContentType': 'application/octet-stream', 'Body': <localstack.services.s3.storage.ephemeral.EphemeralS3StoredObject object at 0xffff7f7b3bd0>, 'ServerSideEncryption': <ServerSideEncryption.AES256: 'AES256'>}, headers={'Content-Type': 'application/octet-stream', 'accept-ranges': 'bytes', 'Last-Modified': 'Wed, 13 Aug 2025 17:50:52 GMT', 'Content-Length': '9', 'ETag': '"1e50210a0202497fb79bc38b6ade6c34"', 'x-amz-server-side-encryption': <ServerSideEncryption.AES256: 'AES256'>, 'x-amz-request-id': 'b5b872a3-51f9-4ba5-922b-0ee6f1d5f0f2', 'x-amz-id-2': 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234='})
For folks able to reproduce with S3, we might want to override NodeHttpHandler itself to check the status of the .handle call.
I also added a custom request handler that logs response status and headers. The tail of the script's output against AWS looks like this for me:
AWS SDK context S3Client PutObjectCommand
AWS SDK request input { Bucket: 'martin-slota-test', Key: 'object_91', Body: <Buffer > }
AWS SDK HTTP Response: {
statusCode: 200,
headers: {
'x-amz-id-2': 'aJ3FyEeXR/PTREvV3H5WswNO1m1/t1YnBuVxbYT4NBhLX2IlEbbRB2Zdv1S/9s3djWOk/kF9aw8=',
'x-amz-request-id': '2P86T7T6FP525N8V',
date: 'Wed, 13 Aug 2025 18:02:27 GMT',
'x-amz-server-side-encryption': 'AES256',
etag: '"d41d8cd98f00b204e9800998ecf8427e"',
'x-amz-checksum-crc64nvme': 'AAAAAAAAAAA=',
'x-amz-checksum-type': 'FULL_OBJECT',
'content-length': '0',
server: 'AmazonS3'
}
}
AWS SDK request output: {
'$metadata': {
httpStatusCode: 200,
requestId: '2P86T7T6FP525N8V',
extendedRequestId: 'aJ3FyEeXR/PTREvV3H5WswNO1m1/t1YnBuVxbYT4NBhLX2IlEbbRB2Zdv1S/9s3djWOk/kF9aw8=',
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
ServerSideEncryption: 'AES256',
Body: 'REDACTED'
}
Downloading batch with key object_90...
AWS SDK context S3Client GetObjectCommand
AWS SDK request input { Bucket: 'martin-slota-test', Key: 'object_90' }
file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70
const timeoutError = new TimeoutError();
^
TimeoutError: Promise timed out after 5000 milliseconds
at file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70:24
at new Promise (<anonymous>)
at pTimeout (file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:48:25)
at file:///Users/martin.slota/Code/aws-sdk-lib-storage-race/index.js:142:9
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Node.js v20.15.0
Hi @martinslota , thank you for reviving https://github.com/martinslota/aws-sdk-lib-storage-race
I was able to reproduce the SDK hanging after object_36.
My logs indicate the following (similar to your logs for object_91):
- The SDK sends a GetObjectRequest for object_36
- The server (LocalStack) returns a 200 response for object_36
- The test code times out in 5s before the response is processed
To ensure that the timeout doesn't play any role in this repro, I increased it to 50s. Even though the server immediately sent a response, the repro waited for 50s before timing out.
This means there's no issue on the server side. The issue is likely either in the Node.js runtime's http module, which is not returning the response, or in the SDK's NodeHttpHandler, which is not processing the response from the server. If it's the former, we need to check whether any http configuration can be changed to avoid it, or report it upstream. If it's the latter, then it can be fixed in the SDK itself.
I'll debug it a little more. Here are the contents of the handle function if you want to deep dive in the meantime.
While diving deeper, I removed all logging statements and introduced just two, which log the request method and path before sending the request and after receiving the response, as follows:
console.log(`sending request: ${request.method} for ${request.path}`);
const response = await this.innerHandler.handle(request, options);
console.log(`received response: ${request.method} for ${request.path}`);
I ran the minimal repro with NODE_DEBUG=http set.
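For reference, the repro can be launched with Node's internal http debug logging like this (assuming index.js is the repro's entry point):

```shell
# NODE_DEBUG=http makes Node's http module print internal events
# (socket reuse, agent free/remove, incoming responses) to stderr.
NODE_DEBUG=http node index.js 2> http-debug.log
```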
Here are the logs for a successful GET:
HTTP 44534: have free socket
HTTP 44534: outgoing message end.
HTTP 44534: AGENT incoming response!
received response: GET for /object_273
HTTP 44534: AGENT socket keep-alive
HTTP 44534: CLIENT socket onFree
HTTP 44534: agent.on(free) test-js-6426.s3.localhost.localstack.cloud:4566:::::::::::::::::::::
HTTP 44534: removeSocket test-js-6426.s3.localhost.localstack.cloud:4566::::::::::::::::::::: writable: true
sending request: GET for /object_274
HTTP 44534: have free socket
HTTP 44534: outgoing message end.
HTTP 44534: AGENT incoming response!
received response: GET for /object_274
HTTP 44534: AGENT socket keep-alive
HTTP 44534: CLIENT socket onFree
HTTP 44534: agent.on(free) test-js-6426.s3.localhost.localstack.cloud:4566:::::::::::::::::::::
HTTP 44534: removeSocket test-js-6426.s3.localhost.localstack.cloud:4566::::::::::::::::::::: writable: true
Here are the logs for a GET which hangs:
sending request: PUT for /object_277
HTTP 44534: have free socket
HTTP 44534: AGENT incoming response!
HTTP 44534: AGENT incoming response!
HTTP 44534: write ret = true
HTTP 44534: outgoing message end.
received response: PUT for /object_277
HTTP 44534: AGENT socket keep-alive
HTTP 44534: CLIENT socket onFree
HTTP 44534: agent.on(free) test-js-6426.s3.localhost.localstack.cloud:4566:::::::::::::::::::::
HTTP 44534: removeSocket test-js-6426.s3.localhost.localstack.cloud:4566::::::::::::::::::::: writable: true
sending request: GET for /object_276
HTTP 44534: have free socket
HTTP 44534: outgoing message end.
HTTP 44534: AGENT socket keep-alive
HTTP 44534: CLIENT socket onFree
HTTP 44534: agent.on(free) test-js-6426.s3.localhost.localstack.cloud:4566:::::::::::::::::::::
HTTP 44534: removeSocket test-js-6426.s3.localhost.localstack.cloud:4566::::::::::::::::::::: writable: true
I don't see the following log
HTTP 44534: AGENT incoming response!
That indicates that the JS SDK is not receiving the response from the server, although it was sent by the backend (verified from the LocalStack logs).
I played around with the repro further in my fork https://github.com/trivikr/aws-sdk-lib-storage-race
Noticed two things:
- The issue is not reproducible with PutObject, and is likely unique to Upload in lib-storage. Tested in commit https://github.com/trivikr/aws-sdk-lib-storage-race/commit/8c16e698eaf7864ddf3488e636cf7a0c3596e215
- The issue is reproducible if I just upload an empty stream and download it in succession. Tested in commit https://github.com/trivikr/aws-sdk-lib-storage-race/commit/c7801e030ed2e0fecbbbd955cc049f4ee2853c64
To find out whether it's an issue with lib-storage, I logged the part which was yielded:
File: node_modules/@aws-sdk/lib-storage/dist-cjs/index.js
// src/chunks/getChunkStream.ts
async function* getChunkStream(data, partSize, getNextData) {
// ...
const toYield = {
partNumber,
data: currentBuffer.chunks.length !== 1 ? import_buffer.Buffer.concat(currentBuffer.chunks) : currentBuffer.chunks[0],
lastPart: true
};
console.log({ yield: toYield, chunkLength: toYield.data.byteLength });
yield toYield;
}
There was no difference between the upload preceding the GetObject call which hung and one which was successful:
{
yield: { partNumber: 1, data: <Buffer >, lastPart: true },
chunkLength: 0
}
sending request: PUT for /object_26
received response: PUT for /object_26
sending request: GET for /object_26
received response: GET for /object_26
{
yield: { partNumber: 1, data: <Buffer >, lastPart: true },
chunkLength: 0
}
sending request: PUT for /object_27
received response: PUT for /object_27
sending request: GET for /object_27
Another deep dive into lib-storage is to log when the PutObject call is successful.
File: node_modules/@aws-sdk/lib-storage/dist-cjs/index.js
async __uploadUsingPut(dataPart) {
var _a;
// ...
const totalSize = byteLength(dataPart.data);
console.log({
singleUploadResult: this.singleUploadResult,
totalSize,
});
this.__notifyProgress({
// ...
}
There was no difference between the upload preceding the GetObject call which hung and one which was successful:
LocalStack
sending request: GET for /object_14
received response: GET for /object_14
sending request: PUT for /object_15
received response: PUT for /object_15
{
singleUploadResult: {
'$metadata': {
httpStatusCode: 200,
requestId: '1b850ab9-ac73-4591-af54-591033510c14',
extendedRequestId: 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
ServerSideEncryption: 'AES256',
Bucket: 'test-js-6426',
Key: 'object_15',
Location: 'https://test-js-6426.s3.localhost.localstack.cloud:4566/object_15'
},
totalSize: 0
}
sending request: GET for /object_15
received response: GET for /object_15
sending request: PUT for /object_16
received response: PUT for /object_16
{
singleUploadResult: {
'$metadata': {
httpStatusCode: 200,
requestId: '7c7452cc-18a6-4ee7-a17f-ba530e862359',
extendedRequestId: 's9lzHYrFp76ZVxRcpX9+5cjAnEH2ROuNkd2BHfIa6UkFVdtjf5mKR3/eTPFvsiP/XV/VLi31234=',
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
ServerSideEncryption: 'AES256',
Bucket: 'test-js-6426',
Key: 'object_16',
Location: 'https://test-js-6426.s3.localhost.localstack.cloud:4566/object_16'
},
totalSize: 0
}
sending request: GET for /object_16
S3
sending request: PUT for /object_680
received response: PUT for /object_680
{
singleUploadResult: {
'$metadata': {
httpStatusCode: 200,
requestId: '6V429CE0N165S0P4',
extendedRequestId: 'LfqfFnulFE0fmGQPz6j686QG7U9iOLdC3oRL5p1iYzhU4UY4T5ErDTck+iGUJ0tRB+/lLF/YXiI=',
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
ServerSideEncryption: 'AES256',
Bucket: 'test-js-6426',
Key: 'object_680',
Location: 'https://test-js-6426.s3.us-west-2.amazonaws.com/object_680'
},
totalSize: 0
}
sending request: GET for /object_680
received response: GET for /object_680
sending request: PUT for /object_681
received response: PUT for /object_681
{
singleUploadResult: {
'$metadata': {
httpStatusCode: 200,
requestId: '6V4BD41ZD8HAYBBS',
extendedRequestId: 'pFk7+O+liFmPtXbfA/q4LoKKKp0y3XnCM139DnBXvsKJtFHYrehFn6EUnAK/EN3WNCi4JkuEBHM=',
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
ETag: '"d41d8cd98f00b204e9800998ecf8427e"',
ServerSideEncryption: 'AES256',
Bucket: 'test-js-6426',
Key: 'object_681',
Location: 'https://test-js-6426.s3.us-west-2.amazonaws.com/object_681'
},
totalSize: 0
}
sending request: GET for /object_681
To check whether NodeHttpHandler receives the response, I made the following changes:
File edited: node_modules/@smithy/node-http-handler/dist-cjs/index.js
const requestFunc = isSSL ? import_https.request : import_http.request;
console.log(`[internal] sending request: ${request.method} for ${request.path}`);
const req = requestFunc(nodeHttpsOptions, (res) => {
console.log(`[internal] received response: ${request.method} for ${request.path}`);
Logs show that NodeHttpHandler is not receiving the response, i.e. the callback for https.request is never called:
[internal] sending request: GET for /object_34
[internal] received response: GET for /object_34
[internal] sending request: PUT for /object_35
[internal] received response: PUT for /object_35
[internal] sending request: GET for /object_35
file:///Users/trivikr/workspace/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70
const timeoutError = new TimeoutError();
^
TimeoutError: Promise timed out after 5000 milliseconds
at file:///Users/trivikr/workspace/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:70:24
at new Promise (<anonymous>)
at pTimeout (file:///Users/trivikr/workspace/aws-sdk-lib-storage-race/node_modules/p-timeout/index.js:48:25)
at file:///Users/trivikr/workspace/aws-sdk-lib-storage-race/index.js:84:9
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
Node.js v22.18.0
While reading the code, I came across this comment:
https://github.com/aws/aws-sdk-js-v3/blob/756643ecd734b143ded1f46997ca7c3488cda313/lib/lib-storage/src/Upload.ts#L348-L356
This could be a hint as to what is wrong. The comment is basically inaccurate: because of the Promise.all right above it, we can actually end up calling AbortMultipartUpload while some promises are still pending.
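As a generic illustration of that hazard (not SDK code): Promise.all rejects as soon as any input promise rejects, while the remaining promises keep running, so cleanup performed in the catch block can race with still-in-flight work.

```javascript
// Generic demo: when Promise.all rejects, other promises in the
// array are typically still pending, so cleanup in the catch block
// (like AbortMultipartUpload in the Upload class) races with them.
async function demo() {
  let slowFinished = false;
  const slow = new Promise((resolve) =>
    setTimeout(() => { slowFinished = true; resolve("slow"); }, 50)
  );
  const fast = Promise.reject(new Error("boom"));
  try {
    await Promise.all([slow, fast]);
  } catch {
    // `slow` is still pending here; any cleanup performed now runs
    // concurrently with the work it represents.
    return slowFinished;
  }
}
```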
Here is a PR that might help with this issue, though I'm not sure as I'm not able to reproduce the error locally.
@matthieusieben That seems very promising! I tried to apply your patch, but even with it in place I can still reproduce the issue, both against AWS S3 and against Localstack.
I've been digging around a bit more and overall I think I've managed to conclude the following:
- The issue is not specific to @aws-sdk/lib-storage and can easily be reproduced by just using the regular S3 client with PutObject commands.
- One potential workaround for the issue is to avoid invoking httpRequest.end() with an empty buffer, and instead invoke httpRequest.end() without arguments.
- The underlying issue is likely a race within Node.js itself, specifically between the 'free' event handler on the HttpAgent and the 'finish' event handler in the HttpClient. What I'm basically seeing is a socket getting freed up and used for a new request, followed by the 'finish' handler on the old request wiping the socket clean and thus blocking the response to the new request from getting received. I'm guessing that the wrong assumption is that the socket is still associated with the old request at that time.
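A sketch of the end()-without-arguments workaround mentioned above; writeBody is a hypothetical helper, not code from the SDK or from Node.js:

```javascript
// Hypothetical helper illustrating the workaround: when the body is
// empty, call req.end() with no arguments instead of req.end(buffer),
// since ending with an empty buffer appeared to trigger the race.
function writeBody(req, body) {
  if (body == null || body.length === 0) {
    req.end(); // no-argument form avoids end(Buffer.alloc(0))
  } else {
    req.end(body);
  }
}
```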
I'll try to follow up with updates to the minimal repro repository and a bug report in the Node.js repo.
I simplified the repro so it directly uses PutObject commands.
I also reported a new issue here: https://github.com/nodejs/node/issues/60001.
I suppose this issue could potentially be closed; I don't think the root cause is inside @aws-sdk/lib-storage.