EXCEPTION_ACCESS_VIOLATION on multipart upload cancellation with S3AsyncClient
Describe the bug
We transfer a large file to S3 as a multipart upload, using the S3AsyncClient. The file is read from disk with an AsynchronousFileChannel and a NettyDataBufferFactory. The basic sequence is as follows:
- We initiate a multipart upload to S3
- A chunk of the file is read and copied into a ByteBuf (wrapped in a DataBuffer) - **see Code A**
- The chunk is sent to S3 with an UploadPart request
- When the chunk has been sent, the underlying ByteBuf is released - **see Code B**
- Start again for a new chunk (...)
- (loop)
- Complete the multipart upload
As a side note, the application is based on Spring WebFlux, and the chunk processing above is parallelized: at any given moment, several chunks can be in flight to S3.
In the normal case, everything works fine. But when we cancel the request, the JVM dies with an EXCEPTION_ACCESS_VIOLATION. This seems to be linked to the buffer release, because if we remove the release call the crash never occurs.
Occasionally, the following warning is also logged just before the JVM dies, but I'm not sure it's related:
WARN 24668 --- [tyEventLoop-0-2] s.a.a.h.n.n.i.FutureCancelHandler : [Channel: cce56e59] Received a cancellation exception on a channel that doesn't have an execution Id attached. Exception's execution ID is null. Exception is being ignored. Closing the channel
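One possible explanation for the link between cancellation and buffer release, offered as a hypothesis and not a confirmed diagnosis: cancelling a CompletableFuture only marks the future as cancelled; it does not stop or interrupt the work running behind it. If the cancel signal reaches the UploadPart future while the HTTP client is still reading from the buffer, releasing the buffer at that moment could free memory that is still in use. The JDK-only sketch below shows just the CompletableFuture behavior. The class name and the latch-based stand-in for an upload are hypothetical and involve neither the SDK nor Netty.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class CancelDoesNotStopWork {

    /**
     * Returns true when the future reported "cancelled" while the task behind it
     * was still running, and that task later ran to completion anyway.
     */
    static boolean workOutlivesCancel() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch mayFinish = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);

        // Hypothetical stand-in for an in-flight upload that still reads from a buffer.
        CompletableFuture<Void> upload = CompletableFuture.runAsync(() -> {
            started.countDown();
            try {
                mayFinish.await();     // "still using the buffer"
                finished.countDown();  // "done with the buffer"
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            started.await();
            // Completes the future exceptionally; the running task is not interrupted.
            upload.cancel(true);
            boolean cancelledWhileRunning = upload.isCancelled() && finished.getCount() == 1;
            mayFinish.countDown();
            return cancelledWhileRunning && finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("work outlived cancel: " + workOutlivesCancel());
    }
}
```

Any callback that fires as soon as the future is cancelled therefore runs while the task may still be active, which would be consistent with the crash disappearing when the release is removed.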
Code A
Reading the content of the file into ByteBuf instances (exposed as DataBuffer)
    public Flux<DataBuffer> getContent() {
        return DataBufferUtils
                .readAsynchronousFileChannel(
                        () -> AsynchronousFileChannel.open(filePath, StandardOpenOption.READ),
                        new NettyDataBufferFactory(ByteBufAllocator.DEFAULT), FILECONTEXT_READ_BUFFER_SIZE);
    }
Code B
Process the Future of the UploadPart and release the ByteBuf (the DataBuffer) when it's over.
    return Mono.fromFuture(uploadPartRequestFuture)
            .doFinally(signalType -> DataBufferUtils.release(dataBuffer)) // <=== Here
            .flatMap(uploadPartResult -> {
                LOGGER.info("Upload part complete: part={}, etag={}", partNumber,
                        uploadPartResult.eTag());
                return checkS3Response(uploadPartResult)
                        .thenReturn(CompletedPart.builder()
                                .eTag(uploadPartResult.eTag())
                                .partNumber(partNumber)
                                .build());
            });
Expected Behavior
The request is cancelled, all the buffers are released, and the application keeps running.
Current Behavior
The request is cancelled and the JVM dies with an EXCEPTION_ACCESS_VIOLATION.
Reproduction Steps
You will need access to an S3 service. Check the README to see which environment variables you have to set.
For the test itself, use Postman or an equivalent tool and send a file of a few hundred megabytes. While it is uploading to your S3 service, click "Cancel" in Postman; this triggers the crash fairly regularly.
Possible Solution
Our current workaround is to delay the release of the DataBuffer when we receive a CANCEL or ON_ERROR signal:
    if (signalType.equals(CANCEL) || signalType.equals(ON_ERROR)) {
        Mono.delay(Duration.ofSeconds(5))
                .doOnNext(unused -> {
                    LOGGER.info("releaseDataBuffer() waited 5sec for release of the dataBuffer");
                    DataBufferUtils.release(dataBuffer);
                }).subscribe();
    }
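A delayed release is safer with a guard ensuring the buffer is released exactly once, whichever path fires first. Below is a minimal JDK-only sketch of that idea, under stated assumptions: `OnceReleaser`, `DelayedRelease` and `demo` are hypothetical names, the `Runnable` stands in for `DataBufferUtils.release(dataBuffer)`, and a `ScheduledExecutorService` stands in for `Mono.delay`. A fixed delay only makes the race less likely and does not guarantee that the I/O has finished.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class DelayedRelease {

    /** Runs a release action at most once, either immediately or after a delay. */
    static final class OnceReleaser {
        private final Runnable releaseAction;
        private final AtomicBoolean released = new AtomicBoolean(false);

        OnceReleaser(Runnable releaseAction) {
            this.releaseAction = releaseAction;
        }

        /** Normal completion: release right away (no-op if already released). */
        void releaseNow() {
            if (released.compareAndSet(false, true)) {
                releaseAction.run();
            }
        }

        /** Cancel or error: wait before releasing, in case I/O still holds the buffer. */
        void releaseAfter(ScheduledExecutorService scheduler, long delay, TimeUnit unit) {
            scheduler.schedule(this::releaseNow, delay, unit);
        }
    }

    /** Schedules a delayed release, then releases again; returns how often the action ran. */
    static int demo() {
        AtomicInteger releaseCount = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            OnceReleaser releaser = new OnceReleaser(() -> {
                releaseCount.incrementAndGet();
                done.countDown();
            });
            releaser.releaseAfter(scheduler, 100, TimeUnit.MILLISECONDS);
            done.await(5, TimeUnit.SECONDS);
            releaser.releaseNow(); // second call is ignored by the guard
            return releaseCount.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("release count: " + demo());
    }
}
```

In the reactive pipeline, the same guard could be called from `doFinally`, with `releaseNow()` on normal completion and `releaseAfter(...)` on CANCEL or ON_ERROR.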
Additional Information/Context
Please ask if you need any further details.
AWS Java SDK version used
2.21.15
JDK version used
OpenJDK Runtime Environment Temurin-17.0.6+10
Operating System and version
Windows 11