ambry
ambry copied to clipboard
ByteBuf leak in LocalNetworkClient
It looks like LocalNetworkClient has some memory leak. @lightningrob @justinlin-linkedin
It can be constantly reproduced by testStitchGetUpdateDelete test.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/zemao/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.25/110cefe2df103412849d72ef7a67e4e91e4266b4/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/zemao/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-log4j12/1.7.5/6edffc576ce104ec769d954618764f39f0f0f10d/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[2020-11-04 22:30:43,925] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:43,997] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,003] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,009] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,009] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,010] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,010] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,017] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:44,017] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,017] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:44,320] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,854] INFO Creating cloud blob store for partition 3 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,863] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:44,992] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,001] INFO Creating cloud blob store for partition 0 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,005] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,103] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,111] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,146] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,147] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,149] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,149] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,150] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,250] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:45,276] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,276] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,277] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,277] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,277] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,278] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,279] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,279] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,280] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,280] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,287] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,298] INFO Creating cloud blob store for partition 3 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,307] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,318] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,363] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,368] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,422] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
io.netty.buffer.AbstractByteBufAllocator.compositeHeapBuffer(AbstractByteBufAllocator.java:213)
com.github.ambry.protocol.PutRequest.prepareBuffer(PutRequest.java:210)
com.github.ambry.protocol.PutRequest.writeTo(PutRequest.java:229)
com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,422] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:385)
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
com.github.ambry.protocol.RequestOrResponse.prepareBuffer(RequestOrResponse.java:90)
com.github.ambry.protocol.GetRequest.prepareBuffer(GetRequest.java:94)
com.github.ambry.protocol.RequestOrResponse.writeTo(RequestOrResponse.java:98)
com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,423] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
io.netty.buffer.AbstractByteBufAllocator.compositeDirectBuffer(AbstractByteBufAllocator.java:223)
io.netty.buffer.AbstractByteBufAllocator.compositeDirectBuffer(AbstractByteBufAllocator.java:218)
com.github.ambry.protocol.GetResponse.prepareBuffer(GetResponse.java:142)
com.github.ambry.protocol.GetResponse.writeTo(GetResponse.java:156)
com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
com.github.ambry.network.LocalRequestResponseChannel.sendResponse(LocalRequestResponseChannel.java:71)
com.github.ambry.protocol.AmbryRequests.sendGetResponse(AmbryRequests.java:774)
com.github.ambry.protocol.AmbryRequests.handleGetRequest(AmbryRequests.java:386)
com.github.ambry.protocol.AmbryRequests.handleRequests(AmbryRequests.java:122)
com.github.ambry.protocol.RequestHandler.run(RequestHandler.java:47)
java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:45,451] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,455] INFO Creating cloud blob store for partition 4 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:45,500] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,501] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,503] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,504] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,506] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:45,590] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:45,595] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,595] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,596] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,597] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,597] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,598] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,600] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,601] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,602] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,602] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:45,603] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:45,603] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,101] INFO Creating cloud blob store for partition 0 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,112] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,121] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,205] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,209] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,262] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,278] ERROR LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
Recent access records:
Created at:
io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:385)
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
com.github.ambry.protocol.RequestOrResponse.prepareBuffer(RequestOrResponse.java:90)
com.github.ambry.protocol.DeleteRequest.prepareBuffer(DeleteRequest.java:75)
com.github.ambry.protocol.RequestOrResponse.writeTo(RequestOrResponse.java:98)
com.github.ambry.network.LocalRequestResponseChannel.byteBufFromPayload(LocalRequestResponseChannel.java:125)
com.github.ambry.network.LocalNetworkClient.sendAndPoll(LocalNetworkClient.java:83)
com.github.ambry.network.CompositeNetworkClient.lambda$null$1(CompositeNetworkClient.java:97)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745) (io.netty.util.ResourceLeakDetector)
[2020-11-04 22:30:46,318] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,318] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,319] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,321] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,322] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,403] INFO The metric scope of adaptive tracker is Datacenter (com.github.ambry.router.NonBlockingRouterMetrics)
[2020-11-04 22:30:46,407] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,408] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,408] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,409] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,409] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,409] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,411] INFO Warming up local datacenter connections to 3 nodes. Connections warmup percentage: 67%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,412] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Warming up remote datacenter connections to 6 nodes. Connections warmup percentage: 34%. (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,412] INFO Connection warm up start. (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,412] INFO Connection warm up done. Tried: 6, Succeeded: 6, Failed: 0, Time elapsed: 0 ms (com.github.ambry.network.SocketNetworkClient)
[2020-11-04 22:30:46,423] INFO Creating cloud blob store for partition 7 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,434] INFO Creating cloud blob store for partition 6 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,442] INFO Creating cloud blob store for partition 1 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,450] INFO Creating cloud blob store for partition 2 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,513] INFO Creating cloud blob store for partition 4 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,552] INFO Creating cloud blob store for partition 8 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,558] INFO Creating cloud blob store for partition 5 with no cache (com.github.ambry.cloud.CloudBlobStore)
[2020-11-04 22:30:46,623] INFO Closing the router (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,623] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,624] INFO OperationController is shutting down (com.github.ambry.router.NonBlockingRouter)
[2020-11-04 22:30:46,624] INFO shutting down (com.github.ambry.protocol.RequestHandlerPool)
[2020-11-04 22:30:46,625] INFO shut down completely (com.github.ambry.protocol.RequestHandlerPool)
Process finished with exit code 0
I'm able to reproduce this by running the CloudRouterTest, but I don't yet know the reason for the leak. The ByteBuf that is allocated in RequestOrResponse::prepareBuffer ought to be released in the NonBlockingRouter::run loop on this line:
responseInfoList.forEach(ResponseInfo::release);
I will need to investigate if there are code paths where this does not happen.