vertx-circuit-breaker
vertx-circuit-breaker copied to clipboard
Higher memory usage, blocked thread when using circuit breaker
Version
Context
I am getting high memory usage with the circuit breaker. I am just using httpbin.org to get all success responses. For individual requests, it works fine. While running a load test the JVM old gen utilization is spiking up.
Reproducer
I have the main verticle code, pasting it here itself:
` public class CleanServer extends AbstractVerticle { Logger logger = Logger.getLogger(CleanServer.class.getName());
@Override
public void start(Promise<Void> startPromise) throws Exception {
Router router = Router.router(vertx);
CircuitBreakerCache cbc = new CircuitBreakerCache(vertx);
router.route(HttpMethod.GET, "/get").handler(context -> {
List<String> domains = context.queryParam("user");
String domain = domains.get(0);
CircuitBreaker cb = cbc.getCircuitBreaker(domain + context.request().path());
HttpServerResponse serverResponse =
context.response().setChunked(true);
cb.executeWithFallback(promise -> {
WebClientOptions options = new WebClientOptions().setTryUseCompression(true).setTcpNoDelay(true).setTcpCork(true).setReceiveBufferSize(128).setConnectTimeout(400);
WebClient client = WebClient.create(vertx, options);
client.get(80, "httpbin.org", "/status/200")
.timeout(2000)
.send(ar -> {
if (ar.succeeded()) {
HttpResponse<Buffer> response = ar.result();
int statusCode = response.statusCode();
if (statusCode != 200) {
promise.fail(response.statusMessage());
} else {
serverResponse.end("Hello!!");
promise.complete();
}
} else {
promise.fail(ar.cause().getMessage());
}
});
}, v -> {
// Executed when the circuit is opened
logger.log(Level.INFO, domain + " Failed " + cb.state().toString() + " Error: Circuit open");
serverResponse.setStatusCode(200).setStatusMessage("Circuit Open").end("Circuit Open");
return context;
});
});
// Create the HTTP server
vertx.createHttpServer(new HttpServerOptions().setMaxInitialLineLength(10000))
// Handle every request using the router
.requestHandler(router)
// Start listening
.listen(8080)
// Print the port
.onSuccess(server ->
System.out.println(
"HTTP server started on port " + server.actualPort()
)
);
}
} `
Circuit breaker options:
CircuitBreakerOptions() .setMaxFailures(50) .setTimeout(5000) .setFallbackOnFailure(true) .setResetTimeout(10000)));
Steps to reproduce
- API used: http://localhost:8080/get?user=abc
- When I am hitting the above API at 50 QPS for 30 minutes. The java heap is getting filled up.
Extra
<vertx.version>4.2.6</vertx.version>
JVM params used: -XX:+UseG1GC -Xms4g -Xmx4g -XX:InitiatingHeapOccupancyPercent=70 -XX:MaxGCPauseMillis=200 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5
JVM memory with the load test.
Error: WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 3050 ms, time limit is 2000 ms
I think I am blocking the thread somewhere but not sure where exactly as the code seems pretty simple as given in the documentation.
I've seen this issue as well, at ~ 4 PM, I deployed a new version of our application without the circuit breaker and the memory consumption wasn't ever-growing anymore as before.
An heap dump showed that there were many byte[]
referenced by VertxUnsafeHeapByteBuf
, that reference a HttpResponseImpl
which is what our application returns as result of the VertxCircuitBreaker.execute()
method (it's a Future<HttpResponse<Buffer>>
:
Thanks, I'll take a look later this week
Do you have any update on this? @tsegismont
@jyotipatel sorry not yet. I should be able to look into it next week
In the mean time, is there a way to disable metrics altogether? As in the heap dump, I see histogram is taking the highest amount of memory
No. I believe we should disable the CircuitBreakerMetrics
when eventbus notifications are disabled.
I tried reducing the DEFAULT_METRICS_ROLLING_BUCKETS and the full GC trigger is delayed with it. As the old generation memory heap is still increasing but with slower rate this time.
So, the issue seems to be with the CircuitBreakerMetrics
, more specifically HdrHistogram
.
Thought this might help you in the fix @tsegismont .
Thanks for sharing your findings
@jyotipatel I wasn't able to reproduce using 4.2.6
or 4.3.1
. Perhaps you can share with us a full reproducer as a GH project? Ideally it would have runnable server code and load testing instructions.
I have a couple remarks regarding the snippet.
Why do you create one WebClient per remote call? Usually a single web client per verticle is often enough. Also, it looks unusual to create a circuit breaker per request path. Why do you need so many of them
I've sent a PR to disable metrics computation anyway: https://github.com/vert-x3/vertx-circuit-breaker/pull/61
@pierDipi your memory problem looks different, can you share with us a small reproducer or at least the heap dump file?
@tsegismont Thanks for looking into it. I will add the reproducer by end of the next week. I will also try to reduce the client creation. Our test case is to break the circuit for a particular user in case the failure is for that user only. Thus we need to have a circuit breaker per path and user_id combination. user_id will be a URL parameter in our case.
@tsegismont I have shared the reproducer code in github with you as a collaborator: https://github.com/jyotipatel/vertx-circuit-breaker-httpbin. Please take a look.
@jyotipatel can you please send the invitation again, it has expired. Thank you
Sent again @tsegismont
Thank you @jyotipatel
@jyotipatel I have been able to reproduce and found the issue is in the CircuitBreakerCache
implementation. It creates a CircuitBreaker
any time the hash map getOrDefault
method is invoked. But then this instance is never collected (even if the key is already in the map) because the circuit breaker schedules a periodic timer that keeps a reference to it.
In order to keep a reasonable number of circuit breaker implementations in your app, change your implementation to:
public CircuitBreaker getCircuitBreaker(String key) {
CircuitBreaker cb = circuitBreakers.get(key);
if (cb == null) {
cb = CircuitBreaker.create(key, vertx,
new CircuitBreakerOptions()
.setMaxFailures(50)
.setTimeout(5000)
.setFallbackOnFailure(true)
.setResetTimeout(10000));
circuitBreakers.put(key, cb);
}
return cb;
}
And I would recommend again to avoid creating a WebClient
for each incoming request.