Khepri integration: AMQP declaration performance

Open mkuratczyk opened this issue 2 years ago • 2 comments

NOTE: This issue is related to an unreleased Khepri integration (branch khepri-queues).
NOTE 2: This issue has been significantly edited to use a simpler test case.

On my system perf-test -p -e amq.topic -t topic -qp q-%d -qpf 1 -qpt 10000 -y 0 -c 1 -C 1 takes 30 seconds with Mnesia and 80 seconds with Khepri. This is 10000 classic queues + 10000 topic bindings + 10000 direct bindings.
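For anyone wanting to repeat the measurement, here is a minimal Python sketch that assembles the same perf-test invocation and can time it. The helper names (perf_test_args, time_run) are hypothetical, and it assumes a perf-test executable is on PATH and a broker is reachable with perf-test's default connection settings; only the flags already quoted in this issue are used.

```python
import shlex
import subprocess
import time


def perf_test_args(queue_count=10000, skip_binding_queues=False):
    """Build the perf-test argument list quoted in this issue (hypothetical helper)."""
    args = [
        "perf-test",
        "-p",
        "-e", "amq.topic",
        "-t", "topic",
        "-qp", "q-%d",
        "-qpf", "1",
        "-qpt", str(queue_count),
        "-y", "0",
        "-c", "1",
        "-C", "1",
    ]
    if skip_binding_queues:
        # Variant used later in the thread.
        args.append("--skip-binding-queues")
    return args


def time_run(args):
    """Run the command and return wall-clock seconds (assumes perf-test is on PATH)."""
    start = time.monotonic()
    subprocess.run(args, check=True)
    return time.monotonic() - start


# Only print the commands here; call time_run(...) against a running broker.
for skip in (False, True):
    print(shlex.join(perf_test_args(skip_binding_queues=skip)))
```

Calling time_run(perf_test_args()) once against a Mnesia-backed node and once against a Khepri-backed node would give figures comparable to the ones above.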

It's interesting to consider this story together with https://github.com/rabbitmq/rabbitmq-server/issues/5106. Definitions import is generally faster with Khepri with a dataset of this size, but it is almost 3 times slower when performed by an AMQP client.

mkuratczyk, Jun 24 '22 12:06

@mkuratczyk This should be solved on https://github.com/rabbitmq/rabbitmq-server/tree/khepri-queues-rebase/, commit https://github.com/rabbitmq/rabbitmq-server/commit/85a456b9ade158aeb467ae5aeee4f070f188411d. On my laptop I'm now getting the same times as with Mnesia.

dcorbacho, Jul 13 '22 13:07

Adding --skip-binding-queues is also useful for comparison. It skips declaring the topic bindings, so it gives us 10k queue declarations and 10k direct bindings. Khepri is also slower in this case:

For me, perf-test -p -e amq.topic -t topic -qp q-%d -qpf 1 -qpt 10000 -y 0 -c 1 -C 1 --skip-binding-queues takes:

  • mnesia: 21 seconds
  • khepri: 42 seconds

So Mnesia needs about 10 more seconds to add the 10k topic bindings, while Khepri needs 30-40 more seconds to add them.
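The estimate above can be reproduced by subtracting the --skip-binding-queues timings from the full-test timings. This is only a rough sketch: it assumes the two sets of runs (reported on different dates) are comparable, and the variable names are illustrative.

```python
# Wall-clock seconds reported in this thread.
full = {"mnesia": 30, "khepri": 80}           # queues + topic bindings + direct bindings
skip_bindings = {"mnesia": 21, "khepri": 42}  # same test with --skip-binding-queues

TOPIC_BINDINGS = 10000

# Time attributable to the topic bindings alone, per metadata store.
binding_cost = {db: full[db] - skip_bindings[db] for db in full}

for db, secs in binding_cost.items():
    per_binding_ms = secs / TOPIC_BINDINGS * 1000
    print(f"{db}: ~{secs}s for 10k topic bindings (~{per_binding_ms:.1f} ms each)")
```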

mkuratczyk, Jul 20 '22 15:07

Update after almost a year of development. Tests were performed on the khepri branch with the khepri_db feature flag (FF) disabled/enabled.

  1. perf-test -p -e amq.topic -t topic -qp q-%d -qpf 1 -qpt 10000 -y 0 -c 1 -C 1
       • mnesia: 36s
       • khepri: 227s

  2. perf-test -p -e amq.topic -t topic -qp q-%d -qpf 1 -qpt 10000 -y 0 -c 1 -C 1 --skip-binding-queues
       • mnesia: 27s
       • khepri: 127s

Nothing clearly stands out (flamegraph captured for 200s during the first of the above tests), unless Khepri somehow makes the rabbit_mgmt_metrics_collector slower.

[flamegraph image: issue-5101]

For comparison, a flamegraph of 20s captured when running against Mnesia:

[flamegraph image: mnesia]

mkuratczyk, Sep 11 '23 13:09

Diana pointed out that this is not a valid test. perf-test defaults to transient queues, which are not supported in 4.0. After adding -ad false -f persistent, Mnesia takes 245s and 133s for the two tests respectively, so it is in fact a bit slower than Khepri (227s and 127s).
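To put "a bit slower" into numbers, the sketch below compares the corrected Mnesia timings with the Khepri timings from the previous comment. It assumes those Khepri figures remain the right baseline for the durable-queue test; slowdown_pct is a hypothetical helper.

```python
# Seconds as (full test, --skip-binding-queues).
khepri = (227, 127)
mnesia_durable = (245, 133)  # with -ad false -f persistent


def slowdown_pct(slower, faster):
    """Percentage by which `slower` exceeds `faster`."""
    return (slower - faster) / faster * 100


labels = ("full test", "--skip-binding-queues")
for label, m, k in zip(labels, mnesia_durable, khepri):
    print(f"{label}: Mnesia {m}s vs Khepri {k}s ({slowdown_pct(m, k):.1f}% slower)")
```

On these figures Mnesia comes out single-digit percent slower in both variants, which matches the conclusion that the original gap was an artifact of the transient-queue default.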

I'll close this issue and perhaps raise a new one after testing how exclusive queues are handled. In some cases users may have a lot of exclusive queues (e.g. MQTT QoS 0), so we should be able to handle many exclusive queues quickly.

mkuratczyk, Sep 11 '23 14:09