dragonfly
feat(server): Pubsub updates with RCU
Implements RCU (read-copy-update) for updating the centralized channel store.
Unlike the old mechanism, which sharded subscriber info across shards, a centralized store avoids the extra hop needed to fetch subscribers. In general this improves latency only slightly, but under heavy traffic on a single channel it spreads the load: no single shard is a bottleneck any longer, which raises throughput several-fold (about 2.7x in the single-channel benchmark below).
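To make the read-copy-update idea concrete, here is a minimal sketch of a centralized channel store. It is not the code from this PR: `ChannelStore`, `SubscriberId`, `Snapshot`, `FetchSubscribers` and `Update` are hypothetical names, and it assumes `shared_ptr` snapshots (via the `std::atomic_load`/`std::atomic_store` overloads from `<memory>`) in place of real RCU grace periods. Readers only load the current pointer; a writer copies the map, modifies the copy, and publishes it with one pointer swap.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a subscriber is just an id, the store is one map.
using SubscriberId = int;
using ChannelMap = std::map<std::string, std::vector<SubscriberId>>;

class ChannelStore {
 public:
  // Read side: take the current immutable snapshot. It stays valid for as
  // long as the caller holds the pointer, even if writers publish new maps.
  std::shared_ptr<const ChannelMap> Snapshot() const {
    return std::atomic_load(&current_);
  }

  // Any thread can look up subscribers locally, with no hop to an owner shard.
  std::vector<SubscriberId> FetchSubscribers(const std::string& channel) const {
    std::shared_ptr<const ChannelMap> snap = Snapshot();
    assert(snap && "the store always holds a snapshot");
    auto it = snap->find(channel);
    if (it == snap->end()) return {};
    return it->second;
  }

  void Subscribe(const std::string& channel, SubscriberId id) {
    Update([&](ChannelMap& m) { m[channel].push_back(id); });
  }

  void Unsubscribe(const std::string& channel, SubscriberId id) {
    Update([&](ChannelMap& m) {
      auto it = m.find(channel);
      if (it == m.end()) return;
      auto& subs = it->second;
      subs.erase(std::remove(subs.begin(), subs.end(), id), subs.end());
      if (subs.empty()) m.erase(it);  // drop channels with no subscribers
    });
  }

 private:
  // Write side: copy the map, update the copy, publish it atomically.
  template <typename Fn>
  void Update(Fn&& fn) {
    std::lock_guard<std::mutex> lk(update_mu_);  // writers are serialized
    auto next = std::make_shared<ChannelMap>(*std::atomic_load(&current_));
    fn(*next);
    std::atomic_store(&current_,
                      std::shared_ptr<const ChannelMap>(std::move(next)));
  }

  std::mutex update_mu_;
  std::shared_ptr<const ChannelMap> current_{std::make_shared<ChannelMap>()};
};
```

The trade-off in this sketch is that every update copies the whole map, so it fits workloads where publishes far outnumber subscribe and unsubscribe calls.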
Benchmarks:
1. Dry run without subscribers
OLD
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs     955759.86       0.75294      0.71100      1.32700        4.57500   43709.23
Totals       955759.86       0.75294      0.71100      1.32700        4.57500   43709.23
NEW
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs    1040942.05       0.69134      0.67900      1.07900        3.75900   47604.77
Totals      1040942.05       0.69134      0.67900      1.07900        3.75900   47604.77
2. Run with subscribers
OLD
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs     399898.33       1.79988      1.37500      7.00700       11.64700   18288.39
Totals       399898.33       1.79988      1.37500      7.00700       11.64700   18288.39
NEW
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs     418730.46       1.71866      1.21500      6.97500       10.81500   19149.67
Totals       418730.46       1.71866      1.21500      6.97500       10.81500   19149.67
3. Single channel
OLD
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs      88210.54       9.06565      8.76700     15.10300       22.01500    4048.73
Totals        88210.54       9.06565      8.76700     15.10300       22.01500    4048.73
NEW
========================================================================================
Type           Ops/sec  Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency     KB/sec
----------------------------------------------------------------------------------------
Publishs     236476.81       3.38045      1.13500     27.39100       35.32700   10853.92
Totals       236476.81       3.38045      1.13500     27.39100       35.32700   10853.92
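For a quick read of the three runs, the throughput ratios can be worked out from the Ops/sec columns. The helper below is only an illustrative sketch (`Speedup` and the constant names are made up here); the inputs are copied from the tables above.

```cpp
#include <cassert>

// Throughput ratio NEW / OLD; a value above 1.0 means the new build is faster.
double Speedup(double old_ops_per_sec, double new_ops_per_sec) {
  assert(old_ops_per_sec > 0.0);
  return new_ops_per_sec / old_ops_per_sec;
}

// Ops/sec figures taken from benchmarks 1, 2 and 3 respectively.
const double kDryRun = Speedup(955759.86, 1040942.05);          // about 1.09x
const double kWithSubscribers = Speedup(399898.33, 418730.46);  // about 1.05x
const double kSingleChannel = Speedup(88210.54, 236476.81);     // about 2.68x
```

The single-channel case is where the gain concentrates. In that same run the average and p50 latency drop, while p99 and p99.9 latency are higher in the NEW numbers.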
Please adopt a habit of adding additional info to the commit description. It's not a one-liner PR.
Fixed.
My only concern is around the Apply() bottleneck.
Yes, I haven't optimized it yet; we should squash parallel updates in the future.
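As a rough illustration of what squashing parallel updates could mean, the sketch below queues updates and applies a whole batch onto a single copy of the map, so N concurrent subscribe or unsubscribe requests cost one copy instead of N. Everything here is an assumption made for illustration: `SquashingStore`, `Enqueue`, `Flush` and `copies` are hypothetical names, and this is not how Apply() is implemented in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using ChannelMap = std::map<std::string, std::vector<int>>;
using UpdateFn = std::function<void(ChannelMap&)>;

class SquashingStore {
 public:
  // Cheap: only records the requested change, no copy happens here.
  void Enqueue(UpdateFn fn) {
    assert(fn && "update must be callable");
    std::lock_guard<std::mutex> lk(queue_mu_);
    pending_.push_back(std::move(fn));
  }

  // Applies every queued update to ONE copy of the map and publishes it.
  // Returns how many updates were squashed into this publish.
  std::size_t Flush() {
    std::lock_guard<std::mutex> publish_lk(publish_mu_);  // keeps batches ordered
    std::vector<UpdateFn> batch;
    {
      std::lock_guard<std::mutex> lk(queue_mu_);
      batch.swap(pending_);
    }
    if (batch.empty()) return 0;
    auto next = std::make_shared<ChannelMap>(*std::atomic_load(&current_));
    ++copies_;
    for (auto& fn : batch) fn(*next);
    std::atomic_store(&current_,
                      std::shared_ptr<const ChannelMap>(std::move(next)));
    return batch.size();
  }

  std::shared_ptr<const ChannelMap> Snapshot() const {
    return std::atomic_load(&current_);
  }

  // Number of full map copies made so far (guarded by publish_mu_).
  int copies() const {
    std::lock_guard<std::mutex> lk(publish_mu_);
    return copies_;
  }

 private:
  mutable std::mutex publish_mu_;
  std::mutex queue_mu_;
  std::vector<UpdateFn> pending_;
  int copies_ = 0;
  std::shared_ptr<const ChannelMap> current_{std::make_shared<ChannelMap>()};
};
```

Who calls `Flush` and when (a dedicated thread, or the first updater to arrive) is left open; that choice decides how much latency a single update pays for the batching.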