performance degradation when no handler

Open ckoutsouridis opened this issue 1 year ago • 1 comments

I have noticed that publishing messages to addresses with no handlers is actually more expensive than publishing messages to addresses with simple consumers.

The example below demonstrates this:

    @Test
    void noHandlers() {

        var vertx = Vertx.vertx();

        var eventBus = vertx.eventBus();

        AtomicLong atomicLong = new AtomicLong();
        eventBus.localConsumer("test", event -> {
            atomicLong.incrementAndGet();
        });
        eventBus.publish("test", "payload");

        eventBus.publish("test", "payload");

        eventBus.publish("test", "payload");

        long start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            eventBus.publish("test", "payload");
        }
        long end = System.currentTimeMillis();
        System.out.println("time with consumer: "+ (end - start));


        start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            eventBus.publish("no addr", "payload");
        }
        end = System.currentTimeMillis();
        System.out.println("time with no consumer: "+ (end - start));
    }

On my machine, this outputs (times in milliseconds):

time with consumer: 185
time with no consumer: 895

I believe this boils down to the way the no-handlers case is modelled: EventBusImpl.deliverMessageLocally creates an exception and concatenates strings for every undelivered message:

    ConcurrentCyclicSequence<HandlerHolder> handlers = handlerMap.get(msg.address());
    boolean messageLocal = isMessageLocal(msg);
    if (handlers != null) {
       ...
      return null;
    } else {
      if (metrics != null) {
        metrics.messageReceived(msg.address(), !msg.isSend(), messageLocal, 0);
      }
      return new ReplyException(ReplyFailure.NO_HANDLERS, "No handlers for address " + msg.address);
    }
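
To make the hypothesis above concrete, here is a minimal, self-contained sketch of the two costs it names: building a message string and constructing an exception per call. It uses only the JDK; the class names `WithTrace` and `Stackless` are hypothetical and are not Vert.x types. Whether `ReplyException` actually captures a stack trace is not checked here, so this only illustrates the general mechanism and is not a measurement of Vert.x itself.

```java
import java.util.function.Supplier;

public class StackTraceCostSketch {

    // Ordinary exception: the constructor captures the current stack trace.
    static class WithTrace extends RuntimeException {
        WithTrace(String message) {
            super(message);
        }
    }

    // Hypothetical stackless variant: writableStackTrace=false skips the capture.
    static class Stackless extends RuntimeException {
        Stackless(String message) {
            super(message, null, false, false);
        }
    }

    // Constructs n exceptions and returns the elapsed time in nanoseconds.
    static long timeNanos(Supplier<RuntimeException> factory, int n) {
        long sink = 0; // consume the result so the loop is not optimized away
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink += factory.get().getMessage().length();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 0) {
            throw new IllegalStateException("no exceptions were created");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String address = "no addr";
        int n = 100_000;
        Supplier<RuntimeException> withTrace =
            () -> new WithTrace("No handlers for address " + address);
        Supplier<RuntimeException> stackless =
            () -> new Stackless("No handlers for address " + address);

        // Crude warm-up; a real comparison should use JMH.
        for (int i = 0; i < 5; i++) {
            timeNanos(withTrace, n);
            timeNanos(stackless, n);
        }
        System.out.println("with stack trace (ns): " + timeNanos(withTrace, n));
        System.out.println("stackless (ns):        " + timeNanos(stackless, n));
    }
}
```

The numbers this prints depend heavily on the JVM and on call-stack depth, so treat them as indicative only.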

ckoutsouridis avatar Jul 26 '24 06:07 ckoutsouridis

Would you be able to create a microbenchmark with JMH (we have tests like this already in Vert.x core)?

With such a test it would be possible to validate changes making the no_handlers case more efficient.

tsegismont avatar Jul 26 '24 10:07 tsegismont

Amazing, someone has already encountered this problem and raised it. This is indeed a problem; I have already confirmed it locally.

I have an MQTT broker built on Vert.x MQTT that passes event messages through the event bus. These messages do not necessarily have a handler registered for their address, and when they don't, the overall performance of the MQTT broker declines significantly.

It took me a long time to identify that the performance issue was caused by sending event messages without a handler. Specifically, in this commit: vxmq. In this commit, I removed the registration of empty handlers for all event bus addresses, which resulted in a sharp decline in performance.

@tsegismont @ckoutsouridis

jpforevers avatar Dec 03 '24 09:12 jpforevers

Ironically, the name of this commit is 'Optimize performance: remove an empty event consumption that may be due to performance issues', because I thought removing the empty handlers would improve performance, but it actually reduced performance. HaHa.

jpforevers avatar Dec 03 '24 09:12 jpforevers

To clarify: I have only confirmed that removing the registration of empty handlers for all event bus addresses results in performance degradation. I do not know the specific reason.

jpforevers avatar Dec 04 '24 01:12 jpforevers

Can anyone share a benchmark?

vietj avatar Dec 04 '24 08:12 vietj

I am testing locally and will provide the benchmark asap.

jpforevers avatar Dec 05 '24 01:12 jpforevers

This is my benchmark project: eventbus-with-without-handler-benchmark. Run it with mvn clean compile exec:java. These are the results on my computer:

Benchmark                                             (mode)   Mode  Cnt       Score      Error  Units
EventBusBenchmark.sendToAddressWithHandler     non-clustered  thrpt   25  151029.036 ± 2798.349  ops/s
EventBusBenchmark.sendToAddressWithHandler         clustered  thrpt   25  143005.199 ± 2476.310  ops/s
EventBusBenchmark.sendToAddressWithoutHandler  non-clustered  thrpt   25  150226.319 ± 3519.441  ops/s
EventBusBenchmark.sendToAddressWithoutHandler      clustered  thrpt   25   35797.632 ±  633.378  ops/s

The conclusion is that in cluster mode, there is a significant decrease in performance when publishing data to eventbus addresses without registered handlers. This is consistent with the performance of my MQTT broker code.

@vietj @tsegismont @ckoutsouridis

jpforevers avatar Dec 17 '24 00:12 jpforevers

@vietj @tsegismont Please check out my benchmark.

jpforevers avatar Dec 19 '24 01:12 jpforevers

@ckoutsouridis I tried the test from the issue's description and got the opposite results. Anyway, this is not an actual benchmark, so we cannot draw conclusions from it.

I wrote a simple JMH benchmark:

package io.vertx.benchmarks;

import io.vertx.core.Vertx;
import io.vertx.core.eventbus.EventBus;
import io.vertx.core.eventbus.Message;
import io.vertx.core.eventbus.MessageConsumer;
import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

@Warmup(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 200, timeUnit = TimeUnit.MILLISECONDS)
@Threads(1)
@BenchmarkMode(Mode.Throughput)
@Fork(value = 2)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class EventBusSendAndPublishBenchmark {

  private static final String ADDRESS_WITH_CONSUMER = "foo";
  private static final String ADDRESS_WITHOUT_CONSUMER = "bar";
  private static final String MSG = "msg";

  Vertx vertx;
  EventBus eventBus;
  MessageConsumer<String> consumer;

  @Setup
  public void setup() {
    vertx = Vertx.vertx();
    eventBus = vertx.eventBus();
    consumer = eventBus.consumer(ADDRESS_WITH_CONSUMER, EventBusSendAndPublishBenchmark::consume);
  }

  @CompilerControl(CompilerControl.Mode.DONT_INLINE)
  public static void consume(Message<String> ignore) {
  }

  @TearDown
  public void tearDown() throws Exception {
    consumer.unregister().await();
    vertx.close().await();
  }

  @Benchmark
  public void publish() {
    eventBus.publish(ADDRESS_WITH_CONSUMER, MSG);
  }

  @Benchmark
  public void publishNoHandlers() {
    eventBus.publish(ADDRESS_WITHOUT_CONSUMER, MSG);
  }
}

And got these results (admittedly, on my laptop):

Benchmark                                           Mode  Cnt     Score     Error   Units
EventBusSendAndPublishBenchmark.publish            thrpt   20  1505.896 ± 913.070  ops/ms
EventBusSendAndPublishBenchmark.publishNoHandlers  thrpt   20  4902.226 ± 618.335  ops/ms

There's quite some noise, but the results don't show any degradation when there are no handlers.

tsegismont avatar Jan 06 '25 09:01 tsegismont

@jpforevers the problem you are talking about is different: you're pointing to the fact that, in clustered mode, it's less performant to send (or publish) when there are no handlers.

This is expected: we can cache and maintain event bus subscription lookups when there is a handler, but we cannot when there isn't one (we can't cache every lookup that would return an empty result).
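
As an illustration of the caching asymmetry described above, here is a minimal sketch of a lookup cache that stores only non-empty results. It is not the Vert.x cluster manager code: the class `PositiveOnlyLookupCache`, the `remoteLookup` function, and the node names are all hypothetical, and a real implementation would also need to invalidate entries when subscriptions change.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class PositiveOnlyLookupCache {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> remoteLookup; // stands in for a cluster-wide query
    private final AtomicInteger remoteCalls = new AtomicInteger();

    PositiveOnlyLookupCache(Function<String, List<String>> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    // Returns the subscribers for an address, caching only non-empty results.
    List<String> lookup(String address) {
        List<String> cached = cache.get(address);
        if (cached != null) {
            return cached;
        }
        remoteCalls.incrementAndGet();
        List<String> result = remoteLookup.apply(address);
        if (!result.isEmpty()) {
            // The set of addresses with subscribers is bounded, so it is safe to cache.
            cache.put(address, result);
        }
        // Empty results are not cached: the set of possible addresses is unbounded.
        return result;
    }

    int remoteCalls() {
        return remoteCalls.get();
    }

    public static void main(String[] args) {
        Map<String, List<String>> registry = Map.of("foo", List.of("node-1"));
        PositiveOnlyLookupCache lookups =
            new PositiveOnlyLookupCache(a -> registry.getOrDefault(a, List.of()));

        for (int i = 0; i < 1000; i++) {
            lookups.lookup("foo");
        }
        int withSubscriber = lookups.remoteCalls();
        for (int i = 0; i < 1000; i++) {
            lookups.lookup("bar");
        }
        int withoutSubscriber = lookups.remoteCalls() - withSubscriber;

        System.out.println("remote lookups, address with subscriber:    " + withSubscriber);    // prints 1
        System.out.println("remote lookups, address without subscriber: " + withoutSubscriber); // prints 1000
    }
}
```

In this sketch every send to an address without a subscriber pays for a remote lookup, which is consistent with the clustered slowdown reported in the benchmark earlier in the thread.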

tsegismont avatar Jan 06 '25 09:01 tsegismont

Closing the issue, as it seems it does not describe an actual bug.

tsegismont avatar Jan 06 '25 09:01 tsegismont