java.net.SocketException in SHUTDOWN phase while AWS Lambda (Quarkus + GraalVM) uses the AwsAppConfigExtension layer
Describe the bug
Hello everybody.
I'm struggling with an AWS Lambda running on:
- run with: Quarkus (3.26.2)
- built with: GraalVM Oracle (21.0.7)
- architecture: arm64
- runtime: Amazon Linux 2
- handler: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler::handleRequest
with a pinned extension layer:
- name: AWS-AppConfig-Extension-Arm64
- version: 132
- compatible-architectures: arm64
This is the extension layer for AWS AppConfig. The Lambda itself does not contain any code related to connecting to AppConfig; it is just a clean build with the following dependencies:
- implementation("io.quarkus:quarkus-amazon-lambda")
- implementation("software.amazon.awssdk:url-connection-client")
and the simplest possible request handler class:
import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.RequestHandler
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent
import com.test.logging.Logging

class Lambda : RequestHandler<DynamodbEvent, String> {

    private val logger = Logging.logger { }

    override fun handleRequest(
        event: DynamodbEvent,
        context: Context
    ): String {
        logger.info { "Everything OK" }
        return "OK"
    }
}
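For completeness, the Gradle side looks roughly like the sketch below (Kotlin DSL; the plugin versions and the BOM coordinate are assumptions, not copied from the real project):

// build.gradle.kts (sketch; plugin versions and the platform BOM are assumed)
plugins {
    kotlin("jvm") version "2.0.21"          // assumed Kotlin version
    id("io.quarkus") version "3.26.2"
}

dependencies {
    // the Quarkus platform BOM is assumed to manage the dependency versions
    implementation(enforcedPlatform("io.quarkus.platform:quarkus-bom:3.26.2"))
    implementation("io.quarkus:quarkus-amazon-lambda")
    implementation("software.amazon.awssdk:url-connection-client")
}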
Whenever the Lambda is invoked, everything runs normally until the SHUTDOWN phase, where a java.net.SocketException occurs:
{
"@timestamp": "2025-09-08T15:28:52.825Z",
"log.level": "ERROR",
"process.thread.name": "Lambda Thread (NORMAL)",
"error.stack_trace": "java.net.SocketException: Socket closed\n\tat [email protected]/sun.nio.ch.NioSocketImpl.endRead(NioSocketImpl.java:243)\n\tat [email protected]/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:323)\n\tat [email protected]/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)\n\tat [email protected]/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)\n\tat [email protected]/java.net.Socket$SocketInputStream.read(Socket.java:1099)\n\tat [email protected]/java.io.BufferedInputStream.fill(BufferedInputStream.java:291)\n\tat [email protected]/java.io.BufferedInputStream.read1(BufferedInputStream.java:347)\n\tat [email protected]/java.io.BufferedInputStream.implRead(BufferedInputStream.java:420)\n\tat [email protected]/java.io.BufferedInputStream.read(BufferedInputStream.java:399)\n\tat [email protected]/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:827)\n\tat [email protected]/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:759)\n\tat [email protected]/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1706)\n\tat [email protected]/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615)\n\tat [email protected]/sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:3251)\n\tat io.quarkus.amazon.lambda.runtime.AbstractLambdaPollLoop$1.run(AbstractLambdaPollLoop.java:95)\n\tat [email protected]/java.lang.Thread.runWith(Thread.java:1596)\n\tat [email protected]/java.lang.Thread.run(Thread.java:1583)\n\tat org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:896)\n\tat org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:872)\n",
"error.type": "java.net.SocketException",
"error.message": "Socket closed",
"message": "Error running lambda (NORMAL) [Error Occurred After Shutdown]",
"ecs.version": "1.12.1"
}
Expected behavior
The poll loop should quietly terminate during container shutdown without logging SocketException: Socket closed as an ERROR.
Actual behavior
With the AWS AppConfig Extension attached, the poll loop logs SocketException: Socket closed in the Quarkus stack trace during container shutdown.
Below I attached two screenshots showing the CloudWatch logs for the Lambda's whole lifecycle:
and then, after some minutes of the container being IDLE, it goes to the SHUTDOWN phase:
The black boxes just hide sensitive company data; those log lines are completely unrelated to the issue.
How to Reproduce?
No response
Output of uname -a or ver
No response
Output of java -version
21.0.7
Mandrel or GraalVM version (if different from Java)
GraalVM Oracle 21.0.7
Quarkus version or git rev
3.26.2
Build tool (ie. output of mvnw --version or gradlew --version)
Gradle 8.14
Additional information
If I remove the extension layer, the exception does not occur.
The issue occurs only in the SHUTDOWN phase of the Lambda's lifecycle:
and it does not affect the earlier phases in any way. The INIT phase and all of the INVOKE phases run correctly.
Apparently something happens when io.quarkus.amazon.lambda.runtime.AbstractLambdaPollLoop tries to reach the AppConfig agent (the layer) asynchronously while the extension itself is being shut down; both happen at basically the same moment (you can check the timestamps). This is my only guess.
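To illustrate that guess, here is a minimal, self-contained sketch (hypothetical, not the Quarkus code) of the failure mode visible in the stack trace: a thread blocked in a socket read gets java.net.SocketException: Socket closed when the socket is closed underneath it by the shutdown path.

import java.net.ServerSocket
import java.net.Socket
import kotlin.concurrent.thread

// Hypothetical illustration only: a "poll loop" thread blocks reading a response
// that never arrives, and the socket is closed by another thread mid-read,
// which surfaces as "java.net.SocketException: Socket closed".
fun main() {
    val server = ServerSocket(0)                    // accepts the connection but never answers
    val client = Socket("localhost", server.localPort)

    val pollLoop = thread(name = "poll-loop") {
        try {
            client.getInputStream().read()          // blocks, like parseHTTPHeader() in the trace
        } catch (e: Exception) {
            println("poll loop failed: $e")         // java.net.SocketException: Socket closed
        }
    }

    Thread.sleep(500)                               // let the poll thread block in read()
    client.close()                                  // "SHUTDOWN" closes the socket mid-read
    pollLoop.join()
    server.close()
}

In the trace above the blocked read is the HttpURLConnection long poll inside AbstractLambdaPollLoop; something appears to close that connection while the environment is shutting down.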
Could you please help me with this issue? Even though the INIT and INVOKE phases are not affected and the container eventually shuts down correctly, I'd like to understand why it happens at all and how to avoid it. Maybe it can be ignored as harmless behavior, but I'm still quite concerned.
/cc @Karm (native-image), @galderz (native-image), @patriot1burke (amazon-lambda), @radcortez (config), @zakkak (native-image)
@DudekJakub I see you are using GraalVM. Does this issue appear only when you compile your application to a native executable, or in JVM mode as well?
@zakkak we deploy AWS Lambdas with GraalVM and Docker images with the JVM, so we do not have a comparison.
Are you building a native executable?
Yes. We build with ./gradlew build -Dquarkus.native.enabled=true and use function.zip as the artifact.
We have the same issue, but it doesn't seem to be specific to the AWS AppConfig extension layer.
In our case we also build native executables for Lambda; the layer involved is the OpenTelemetry Collector Lambda layer.
We are experiencing the same problem. It also happens with our native-compiled Lambdas when using an OTel collector that reports to a Lambda layer.
Does anyone have a reproducer with the OTel collector that I can use to see the problem in action?
Closing for lack of a reproducer. If one becomes available, we can certainly reexamine the issue.
@geoand what kind of reproducer do you expect? Is something that fails in the cloud (but without a test in the project or a load-test config) enough?
Hi @geoand and others,
I was able to create a simple reproducer project using the AWS Lambda tutorial with an AWS AppConfig layer.
The reproducer is available here: https://github.com/dagrammy/lambda-layer-reproducer and the README describes how to build the project and deploy it to AWS.
One important thing I noticed during development, which is also mentioned in the README, is that I was only able to reproduce the error after increasing the Lambda memory. With 256 MB the error could not be reproduced, but with 1 GB, for example, it could.
My guess is that the application/Lambda shuts down faster than the layer when the Lambda has more memory.
1GB: lambda-layer-reproducer stopped in 0.002s
256MB: lambda-layer-reproducer stopped in 0.016s
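If anyone wants to script that memory change while experimenting, here is a hedged sketch using the AWS SDK for Java v2 (it assumes an extra software.amazon.awssdk:lambda dependency, default credentials and region, and uses the reproducer's function name purely as an illustration):

import software.amazon.awssdk.services.lambda.LambdaClient

// Hypothetical helper: raise the reproducer function's memory so the shutdown
// race becomes observable (per the observation above: 1 GB reproduces, 256 MB does not).
fun main() {
    LambdaClient.create().use { lambda ->
        lambda.updateFunctionConfiguration { req ->
            req.functionName("lambda-layer-reproducer")  // assumed function name
                .memorySize(1024)                        // memory in MB
        }
    }
}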
Hope this reproducer helps :)