dragonfly
recognizing hot keys
Motivation: https://blog.box.com/introducing-memsniff-robust-memcache-traffic-analyzer
For large-scale deployments, caching teams would like to learn about hot keys in real time so that they can handle them in a special way.
Currently, teams develop sniffers (see the link) to do so. This is not a very elegant approach, and it is very CPU-intensive. We could integrate hot-key detection into DF and provide native support for it.
Algorithms that might help: https://www.usenix.org/conference/atc18/presentation/gong https://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2016/CS/CS-2016-01.pdf
We should choose the simpler one, not necessarily the most sophisticated one.
After reading these papers, I think it's better to start with HeavyKeeper. We do not even need to maintain a min-heap data structure, because query complexity is not the issue here; we need to focus on fast updates.
I have read the HeavyKeeper paper and am implementing a HeavyKeeper structure that does not use a min-heap.
The basic idea for introducing hotspot awareness into Dragonfly is to keep one HeavyKeeper per proactor and to sort its contents to find the hottest few keys whenever hotspot information is needed.
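To make the idea concrete, here is a minimal sketch of a HeavyKeeper-style counter with no min-heap. It is only an illustration, not the code from any PR: the class name, the `Add`/`Estimate`/`TopK` methods, the default decay base of 1.08 and the hash mixing are all assumptions. One deliberate simplification: each bucket also stores the key string, so the hottest keys can be recovered by scanning and sorting the buckets on demand; the paper's buckets hold only a fingerprint and a counter.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: depth x width buckets, exponential-decay replacement.
class HeavyKeeper {
 public:
  HeavyKeeper(size_t depth, size_t width, double decay_base = 1.08,
              uint64_t seed = 42)
      : depth_(depth), width_(width), decay_base_(decay_base), rng_(seed),
        buckets_(depth * width) {
    assert(depth > 0 && width > 0 && decay_base > 1.0);
  }

  // Fast path: touches one bucket per row, no heap maintenance.
  void Add(const std::string& key) {
    const uint64_t h = std::hash<std::string>{}(key);
    const uint32_t fp = Fingerprint(h);
    for (size_t row = 0; row < depth_; ++row) {
      Bucket& b = buckets_[row * width_ + Mix(h + row) % width_];
      if (b.count == 0) {
        b.fp = fp; b.key = key; b.count = 1;   // empty bucket: claim it
      } else if (b.fp == fp) {
        ++b.count;                             // same flow: count up
      } else if (uniform_(rng_) <
                 std::pow(decay_base_, -static_cast<double>(b.count))) {
        // Different flow: decay with probability decay_base^(-count).
        if (--b.count == 0) { b.fp = fp; b.key = key; b.count = 1; }
      }
    }
  }

  // Largest counter among the buckets whose fingerprint matches the key.
  uint32_t Estimate(const std::string& key) const {
    const uint64_t h = std::hash<std::string>{}(key);
    const uint32_t fp = Fingerprint(h);
    uint32_t best = 0;
    for (size_t row = 0; row < depth_; ++row) {
      const Bucket& b = buckets_[row * width_ + Mix(h + row) % width_];
      if (b.count > 0 && b.fp == fp) best = std::max(best, b.count);
    }
    return best;
  }

  // Slow path, run only on request: scan all buckets and sort.
  std::vector<std::pair<std::string, uint32_t>> TopK(size_t k) const {
    std::unordered_map<std::string, uint32_t> best;
    for (const Bucket& b : buckets_) {
      if (b.count == 0) continue;
      uint32_t& c = best[b.key];
      c = std::max(c, b.count);
    }
    std::vector<std::pair<std::string, uint32_t>> out(best.begin(), best.end());
    k = std::min(k, out.size());
    std::partial_sort(
        out.begin(), out.begin() + k, out.end(),
        [](const auto& a, const auto& b) { return a.second > b.second; });
    out.resize(k);
    return out;
  }

 private:
  struct Bucket {
    uint32_t fp = 0;
    uint32_t count = 0;
    std::string key;  // simplification: kept only so TopK can report names
  };

  // splitmix64-style finalizer, used to derive per-row indexes.
  static uint64_t Mix(uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
  }
  static uint32_t Fingerprint(uint64_t h) {
    return static_cast<uint32_t>(Mix(h ^ 0x5bd1e995ULL));
  }

  size_t depth_, width_;
  double decay_base_;
  std::mt19937_64 rng_;
  std::uniform_real_distribution<double> uniform_{0.0, 1.0};
  std::vector<Bucket> buckets_;
};
```

With one such instance per proactor, `Add` would be called on every key access and `TopK` only when someone asks for hotspot information, so the update path stays cheap. Storing the key in the bucket costs memory; whether that trade-off is acceptable is an open question.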
I intend to do the following:
step1. Complete and test a HeavyKeeper structure without a min-heap (work in progress).
step2. Think about how HeavyKeeper is used in each proactor of Dragonfly.
step3. What can each proactor do when it senses a hotspot?
step4. How do we aggregate hotspot information from all proactors when Dragonfly users need it?
step5. Do we need to persist hotspot information to provide a hotspot history query function?
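For the aggregation question in step4, one possible shape is sketched below: each proactor reports its local top keys as (key, count) pairs, and the reports are summed per key and partially sorted. `KeyCount` and `MergeTopK` are hypothetical names, and how the reports are actually collected from the proactor threads is left out. If every key is owned by exactly one thread, the summation simply never combines two entries.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using KeyCount = std::pair<std::string, uint64_t>;

// Merge per-thread (key, count) reports into a global top-k list.
std::vector<KeyCount> MergeTopK(
    const std::vector<std::vector<KeyCount>>& per_thread, size_t k) {
  // Sum the counts of keys that show up in more than one report.
  std::unordered_map<std::string, uint64_t> total;
  for (const auto& report : per_thread) {
    for (const auto& kv : report) total[kv.first] += kv.second;
  }

  std::vector<KeyCount> merged(total.begin(), total.end());
  k = std::min(k, merged.size());
  // Highest count first; ties broken by key so the output is deterministic.
  std::partial_sort(merged.begin(), merged.begin() + k, merged.end(),
                    [](const KeyCount& a, const KeyCount& b) {
                      if (a.second != b.second) return a.second > b.second;
                      return a.first < b.first;
                    });
  merged.resize(k);
  return merged;
}
```

Since each report is already an approximate local top list, the merged result is approximate too: a key that is moderately warm on several threads but in no local top list would be missed.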
First I will focus on implementing and testing HeavyKeeper while continuing to dive into Dragonfly's code. The subsequent questions may need to be discussed with community members.
This is not something that can be done quickly, so I think these ideas will appear in many PRs that
@Super-long please join our Discord server https://discord.gg/HsPjXGVH85 and say hello. I will add you to our #dev channel.