Elastic Message pool and Worker pool?
I was thinking of introducing some logic to dynamically size the message pool and the worker pool. Something like: the pools start small and, when a pool is close to being full, its size is increased; the logic for shrinking a pool would be the same (rough sketch after the list below). What do you think about it? Advantages I can think of:
- quicker startup
- lower the memory footprint
- adapt automatically to the computer capacity it's running on
- fewer things to worry about in the configuration file
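To make the idea concrete, here is a rough sketch; none of the names (Pool, Submit, the thresholds) come from hmq, they are made up for illustration. A monitor goroutine watches how full the task queue is and grows or shrinks the worker count between a min and a max:

```go
package elasticpool

import (
	"sync/atomic"
	"time"
)

// Pool is an illustrative elastic worker pool: workers pull tasks from a
// buffered channel, and a resize loop adds or removes workers depending on
// how full that channel is.
type Pool struct {
	tasks   chan func()
	quit    chan struct{} // one signal stops one worker
	workers int32
	min     int32
	max     int32
}

func New(min, max int32, queueSize int) *Pool {
	p := &Pool{
		tasks: make(chan func(), queueSize),
		quit:  make(chan struct{}),
		min:   min,
		max:   max,
	}
	for i := int32(0); i < min; i++ {
		p.spawn()
	}
	go p.resize()
	return p
}

// spawn starts one worker; it runs until it receives a quit signal.
func (p *Pool) spawn() {
	atomic.AddInt32(&p.workers, 1)
	go func() {
		defer atomic.AddInt32(&p.workers, -1)
		for {
			select {
			case task := <-p.tasks:
				task()
			case <-p.quit:
				return
			}
		}
	}()
}

// resize checks queue pressure periodically: grow when the queue is nearly
// full, shrink when it is mostly empty and we are above the minimum.
func (p *Pool) resize() {
	for range time.Tick(100 * time.Millisecond) {
		n := atomic.LoadInt32(&p.workers)
		switch {
		case len(p.tasks) > cap(p.tasks)*3/4 && n < p.max:
			p.spawn()
		case len(p.tasks) < cap(p.tasks)/4 && n > p.min:
			p.quit <- struct{}{}
		}
	}
}

// Submit queues a task; it blocks if the queue is full.
func (p *Pool) Submit(task func()) { p.tasks <- task }
```

Growing and shrinking by one worker per tick keeps it simple; the ¾-full and ¼-full thresholds are arbitrary and would need tuning.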
yes, I agree with you, but we need a mechanism to judge the magnitude of the message flow
@MarcMagnin what about workpool? Would you help review the "pool" branch?
@chowyu08 yay, workpool seems very nice! I think you can merge the "pool" branch.
@MarcMagnin I need to do more tests; the performance I measured today seems worse than before.
Oh, that's weird. I was expecting the performance to be roughly the same, with a lower memory footprint and quicker startup. How are you testing the performance? A good read: https://github.com/golang/go/wiki/Performance
@MarcMagnin my test tool is mqtt_benchmark, and I compare the throughput.
Nice! There is that one as well that I'd like to test at some point: https://github.com/takanorig/mqtt-bench
I've been looking into it in more detail and I think we will not get better results with this workerpool, as it spawns a goroutine when it receives a task. I'm thinking of reusing the current worker pool logic with a small number of fixed workers plus extra workers with a TTL (small sketch below).
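To illustrate just the TTL part (again a sketch, not the actual hmq worker pool): the fixed core workers would block on the shared task channel forever, while an extra worker gives up after being idle for its TTL:

```go
package elasticpool

import "time"

// extraWorker drains tasks until it has been idle for ttl, then exits,
// shrinking the pool back toward the fixed core workers.
func extraWorker(tasks <-chan func(), ttl time.Duration) {
	for {
		select {
		case task := <-tasks:
			task()
		case <-time.After(ttl):
			return
		}
	}
}
```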
I've implemented sync.Pool and it seems to behave nicely; with it we don't have to worry at all about the workers and their TTL. I'll create a branch when I have a bit more time.
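Roughly the pattern I have in mind, as a sketch with made-up names rather than the actual sync_pool branch: each pooled worker owns a goroutine plus a one-slot task channel, puts itself back into the sync.Pool once its task is done, and the pool's New func creates fresh workers on demand.

```go
package dispatcher

import "sync"

// worker is a goroutine with a one-slot mailbox of tasks.
type worker struct {
	tasks chan func()
}

var workerPool = sync.Pool{
	New: func() interface{} {
		w := &worker{tasks: make(chan func(), 1)}
		go w.loop()
		return w
	},
}

// loop runs tasks one at a time and returns the worker to the pool after
// each one, so it can be reused instead of spawning a new goroutine.
func (w *worker) loop() {
	for task := range w.tasks {
		task()
		workerPool.Put(w)
	}
}

// Dispatch takes an idle worker from the pool (or lets New create one) and
// hands it exactly one task.
func Dispatch(task func()) {
	workerPool.Get().(*worker).tasks <- task
}
```

The one-slot buffer matters: a worker is only handed out again after it has put itself back, so the send in Dispatch never blocks. One thing to keep an eye on: a worker that the GC evicts from the pool leaves its goroutine parked on the channel receive.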
After many tries I've ended up with these results:
BenchmarkDispatcherPoolMutex-3        2000000   645 ns/op   2 B/op   0 allocs/op
BenchmarkDispatcherPoolHalfChannel-3  3000000   595 ns/op   0 B/op   0 allocs/op
BenchmarkDispatcherPoolFullChannel-3  2000000   861 ns/op   1 B/op   0 allocs/op
BenchmarkDispatcherOriginal-3         2000000   837 ns/op   4 B/op   0 allocs/op
The original performs really well, but it has static assignments for the number of workers, channels, etc. The advantage of sync.Pool is big, as we don't need to worry about the life cycle of a worker. The perf is a bit lower in the variant that uses channels for all communication, but I presume that's because there is only one message queue; this can be optimized by replacing some of the channels with mutexes. I'll create a branch based on BenchmarkDispatcherPoolHalfChannel when I have a bit of time, as this one is elegant and performs really well.
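For reference, a benchmark along these lines could look like the sketch below; it assumes a Dispatch(func()) entry point like the earlier sync.Pool sketch and is not the code that produced the numbers above.

```go
package dispatcher

import (
	"sync"
	"testing"
)

// BenchmarkDispatcherSketch measures raw dispatch throughput: every
// iteration hands a tiny task to the dispatcher, and we only wait for all
// of them to finish once the timed loop is over.
func BenchmarkDispatcherSketch(b *testing.B) {
	b.ReportAllocs()
	var wg sync.WaitGroup
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			wg.Add(1)
			Dispatch(func() { wg.Done() })
		}
	})
	wg.Wait()
}
```

Running it with go test -bench . -benchmem gives the ns/op, B/op and allocs/op columns.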
could we use a ring buffer?
I've already tested an implementation (gringo), but it's not designed for infinitely running goroutines, as it uses 100% of the CPU while waiting for more work. Another implementation of a "ring buffer" of goroutines would be based on channels, so we would have no advantage over sync.Pool. sync.Pool can be seen as a BlockingCollection of WeakReference objects (in the C# paradigm), so we no longer have to worry about allocating/deallocating workers, which is a big win:
- No waste of memory
- Very efficient
- Simpler code
Interesting post about it: https://stackoverflow.com/questions/38505830/how-to-implement-memory-pooling-in-golang
Any item stored in the Pool may be removed automatically at any time without notification. If the Pool holds the only reference when this happens, the item might be deallocated.
But since we don't store anything we care about in the Pool (basically just a goroutine), that's a nice fit.
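For the message pool half of the title, the usual sync.Pool pattern would look something like this (illustrative types only, not hmq's actual message struct); losing a pooled message to the GC is harmless because New simply builds another one:

```go
package messagepool

import "sync"

// Message stands in for whatever struct the broker recycles per packet.
type Message struct {
	Topic   string
	Payload []byte
}

var messagePool = sync.Pool{
	New: func() interface{} { return &Message{} },
}

// GetMessage returns a recycled message, or a fresh one if the pool is empty.
func GetMessage() *Message { return messagePool.Get().(*Message) }

// PutMessage resets the message and hands it back for reuse.
func PutMessage(m *Message) {
	m.Topic = ""
	m.Payload = m.Payload[:0] // keep the backing array around
	messagePool.Put(m)
}
```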
nice
@chowyu08 I've added the logic in the sync_pool branch. On top of those changes, I've removed the ClusterPool chan as I wasn't sure it was required, and renamed brokerLog to log as it's easier to keep a simpler name. If we want to specify another log source, we just have to put the .go file into another package and do something like var log = logger.Get().Named("Client")
Let me know how it goes and what the perfs are like. I'll try the tools you are using at some point.
sync.Pool is nice, I will do some performance tests. Have you done any tests comparing the two branches?
No, I'm afraid I haven't had time to do perf tests against the MQTT broker, but at least the bench says we should be good.
Hey @chowyu08, have you been able to check the perfs? How is it looking? Regards