rumqtt icon indicating copy to clipboard operation
rumqtt copied to clipboard

Store outgoing_pub in a HashMap for more memory efficient storage of unassigned pkids

Open ambaxter opened this issue 1 year ago • 5 comments

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [x] Formatted with cargo fmt
  • [x] Make an entry to CHANGELOG.md if it's relevant to the users of the library. If it's not relevant mention why.

I wrote an MQTT traffic simulator and noticed that creating many clients within a single process consumes a large amount of memory. The below example is for 1000 clients.

mqtt_before

This is due to rumqttc pre-allocating MqttState.outgoing_pub for each client. I replaced the Vec with a HashMap and dramatically reduced memory usage.

mqtt_after

I know that MqttState.outgoing_pub is most likely pre-allocated for low powered devices. Would it be possible to make using a HashMap either a runtime configuration or feature flag?

ambaxter avatar Jun 17 '24 12:06 ambaxter

By any chance does the code with which you tested these scenarios use QoS 1+?

Seems like the HashMap is saving space as it never gets utilized(which is the case in QoS 0). That shouldn't be the case when we end up using QoS 1+ where allocations will take up some cycles as the map size increases slowly.

de-sh avatar Jun 17 '24 12:06 de-sh

you can also configure how much you want to preallocate using MqttOptions, let's say you set max_inflight to be 10 to save memory, it won't allocate more than that.

swanandx avatar Jun 17 '24 13:06 swanandx

By any chance does the code with which you tested use QoS 1+?

Seems like the HashMap is saving space as it never gets utilized(which is the case in QoS 0). That shouldn't be the case when we end up using QoS 1+ where allocations will take up some cycles as the map size increases slowly.

The simulation I've developed only uses QoS 1 &2. The HashMap should only be as large as the number of Publish entries waiting on PubRec or PubAck. Each HashMap could very well grow to be u16::MAX in length.

My test-case is currently running at 10,000 MQTT clients at ~1500 messages/s for the entire simulation. The simulator's RAM hasn't breached 800 MB in almost an hour of processing.

ambaxter avatar Jun 17 '24 13:06 ambaxter

you can also configure how much you want to preallocate using MqttOptions, let's say you set max_inflight to be 10 to save memory, it won't allocate more than that.

True, but in many cases you won't know what the correct value should be just based off of traffic size alone. There are many variables to consider.

ambaxter avatar Jun 18 '24 08:06 ambaxter

IMHO we can use VecDeque to save pkid/publish and NoticeTx, the reasons:

  1. Since the packets are always ordered, the push_back and pop_front can be used, they work faster than insert and push of Vec.
  2. The memory usage is less than HashMap

Some of the proposed changes are included here: https://github.com/bytebeamio/rumqtt/commit/68fcb73b1d736449626e39d5528d74dbad82287e#diff-35dca8779a1d08a58f298e7f88a1ca9d2c32f79e97024252a6366e4a2f32325a

xiaocq2001 avatar Jul 04 '24 03:07 xiaocq2001