performance degradation when no handler
i have noticed that when we publish messages on addresses with no handlers, it is actually more expensive than publishing messages on addresses with simple consumers.
the bellow example demonstrates
@Test
void noHandlers() {
var vertx = Vertx.vertx();
var eventBus = vertx.eventBus();
AtomicLong atomicLong = new AtomicLong();
eventBus.localConsumer("test", event -> {
atomicLong.incrementAndGet();
});
eventBus.publish("test", "payload");
eventBus.publish("test", "payload");
eventBus.publish("test", "payload");
long start = System.currentTimeMillis();
for (int i = 0; i < 100000; i++) {
eventBus.publish("test", "payload");
}
long end = System.currentTimeMillis();
System.out.println("time with consumer: "+ (end - start));
start = System.currentTimeMillis();
for (int i = 0; i < 100000; i++) {
eventBus.publish("no addr", "payload");
}
end = System.currentTimeMillis();
System.out.println("time with consumer: "+ (end - start));
}
which outputs on my machine
time with consumer: 185
time with no consumer: 895
i believe this boils down to the way the no handlers is modelled by throwing exceptions and concatenating strings in EventBusImpl.deliverMessageLocally
ConcurrentCyclicSequence<HandlerHolder> handlers = handlerMap.get(msg.address());
boolean messageLocal = isMessageLocal(msg);
if (handlers != null) {
...
return null;
} else {
if (metrics != null) {
metrics.messageReceived(msg.address(), !msg.isSend(), messageLocal, 0);
}
return new ReplyException(ReplyFailure.NO_HANDLERS, "No handlers for address " + msg.address);
}
Would you be able to create a microbenchmark with JMH (we have tests like this already in Vert.x core)?
With such a test it would be possible to validate changes making the no_handlers case more efficient.
Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.
I have an MQTT broker developed based on Vert.x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.
It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.
@tsegismont @ckoutsouridis
Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.
I have an MQTT broker developed based on vert. x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.
It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.
@tsegismont @ckoutsouridis
Ironically, the name of this commit is 'Optimize performance: remove an empty event consumption that may be due to performance issues', because I thought removing the empty handlers would improve performance, but it actually reduced performance. HaHa.
Amazing, someone has already encountered this problem and raised it.This is indeed a problem. I have already proven it locally.
I have an MQTT broker developed based on Vert.x MQTT, which passes event messages through the event bus. Event messages sent through the event bus may not necessarily have handlers to handle, and without handlers, the overall performance of the MQTT broker experiences a significant decline.
It took me a long time to initially identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I cancelled the registration of empty handlers for all event bus addresses, resulting in a sharp decline in performance.
@tsegismont @ckoutsouridis
It should be clarified that I have only confirmed that canceling the registration of empty handlers for all eventbus addresses will result in performance degradation, but I do not know the specific reason
anyone can share a benchmark ?
anyone can share a benchmark ?
I am testing locally and will provide the benchmark asap.
anyone can share a benchmark ?
This is my benchmark testing project: eventbus-with-without-handler-benchmark. Start by: mvn clean compile exec:java, this is result on my computer:
Benchmark (mode) Mode Cnt Score Error Units
EventBusBenchmark.sendToAddressWithHandler non-clustered thrpt 25 151029.036 ± 2798.349 ops/s
EventBusBenchmark.sendToAddressWithHandler clustered thrpt 25 143005.199 ± 2476.310 ops/s
EventBusBenchmark.sendToAddressWithoutHandler non-clustered thrpt 25 150226.319 ± 3519.441 ops/s
EventBusBenchmark.sendToAddressWithoutHandler clustered thrpt 25 35797.632 ± 633.378 ops/s
The conclusion is that in cluster mode, there is a significant decrease in performance when publishing data to eventbus addresses without registered handlers. This is consistent with the performance of my MQTT broker code.
@vietj @tsegismont @ckoutsouridis
@vietj @tsegismont Please checkout my benchmark.
@ckoutsouridis I tried the test from the issue's description and got opposite results. Anyway, this is not an actual benchmark so we can draw conclusions from it.
I wrote a simple JMH benchmark:
package io.vertx.benchmarks;
import io.vertx.core.Vertx;
import io.vertx.core.eventbus.EventBus;
import io.vertx.core.eventbus.Message;
import io.vertx.core.eventbus.MessageConsumer;
import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;
@Warmup(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Threads(1)
@BenchmarkMode(Mode.Throughput)
@Fork(value = 2)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class EventBusSendAndPublishBenchmark {
private static final String ADDRESS_WITH_CONSUMER = "foo";
private static final String ADDRESS_WITHOUT_CONSUMER = "bar";
private static final String MSG = "msg";
Vertx vertx;
EventBus eventBus;
MessageConsumer<String> consumer;
@Setup
public void setup() {
vertx = Vertx.vertx();
eventBus = vertx.eventBus();
consumer = eventBus.consumer(ADDRESS_WITH_CONSUMER, EventBusSendAndPublishBenchmark::consume);
}
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public static void consume(Message<String> ignore) {
}
@TearDown
public void tearDown() throws Exception {
consumer.unregister().await();
vertx.close().await();
}
@Benchmark
public void publish() {
eventBus.publish(ADDRESS_WITH_CONSUMER, MSG);
}
@Benchmark
public void publishNoHandlers() {
eventBus.publish(ADDRESS_WITHOUT_CONSUMER, MSG);
}
}
And got these results (admittedly, on my laptop):
Benchmark Mode Cnt Score Error Units
EventBusSendAndPublishBenchmark.publish thrpt 20 1505.896 ± 913.070 ops/ms
EventBusSendAndPublishBenchmark.publishNoHandlers thrpt 20 4902.226 ± 618.335 ops/ms
There's quite some noise, but the results don't show any degradation when there are no handlers.
@jpforevers the problem you are talking about is different: you're pointing to the fact that, in clustered mode, it's less performant to send (or publish) when there are no handlers.
This is expected: we can cache and maintain eventbus subscription lookups when there is a handler, we cannot when there isn't (we can't cache all lookups that would result in an empty result).
Closing the issue as it is seems there is no bug denoted by the issue