Add Ketama Support
Resolves issue #61. Based on haridas' implementation, I've made a few tweaks. This is tested against 1 and many servers, including failing servers and bringing them back up. As far as I can tell everything is working. All tests run without issue.
If you have any suggests for improvement I'd be happy to work on it.
glad to see PR about ketama consistent hashing support. However with looking at the PR i have some thoughts.
- i think we should add more hash algorithms options instead of just crc32(which is mainly used to do data integerty.) i remember that i have find some comments about using crc32 as a hashing algorithm in implementing a memcached client for Java(sorry for not remembering the comments url, if you don't mind, you can Google it.).
- i think we should use a binary tree to find the correct server node to get the cached value. I have recently see this in Spymemcached, which is a memcached client for Java. Some code on this
Andyxning, I've pushed a branch in my repo with Ketama using binary search instead. I didn't commit to master because in my initial testing I found it ran a bit slower than using linear search, which confuses me. The patch is very small so please have a look. I think crc32 is good enough, but if you want to try using md5 or sha instead I would be more than happy to review a pull request from you, or anyone who wants to try it :)
@sergio97 , ok, i will check it out as soon as i have some spare time.
I didn't commit to master because in my initial testing I found it ran a bit slower than using linear search, which confuses me.
Linear search often outperforms binary search on small data sets because of branch mispredictions.
Maybe it it right. However, we can not assume that people using our python memcached client to only setup in a small sets, right? :)
++++++ Ning Xie
2015-04-26 23:24 GMT+08:00 David Ehrmann [email protected]:
I didn't commit to master because in my initial testing I found it ran a bit slower than using linear search, which confuses me.
Linear search often outperforms binary search on small data sets because of branch mispredictions.
— Reply to this email directly or view it on GitHub https://github.com/linsomniac/python-memcached/pull/66#issuecomment-96399426 .
@andyxning You're completely right :) I wasn't sure the exact cause of the slowdown, which is why I didn't merge to master. If @ehrmann is correct (which seems very likely, I tested with only 4 servers), then I will merge the change.
Hi,
If this subject is still of any interest to you guys, I'd like to point out that I implemented a monkey patching of python-memcache on uhashring.
uhashring is a ketama compatible, full featured consistent hashing python lib (and if you care more for performance, it comes with a faster hashing algo).
My 2 cents (and subjective view)
Nice, thank you @ultrabug. Your monkey patch is actually very simple. I'm almost jealous I didn't come up with it :p I don't have time to compare the performance of this vs other options right now. I would most like to compare the performance of this vs the library I started writing (but since had to set aside): https://github.com/breqwatr/python-memcache-unified. I like how you made HashRing.range() a generator - the concern of placement vs the concern of checking if a server is alive are well separated.
@sergio97 glad you like it. I've done a benchmark which you can see on the README.
Feel free to comment / criticize and propose anything you'd see fit. Cheers
Any updates on this? I'd love to see consistent hashing with this memcache client.
@ehrmann will update this PR later. Would you like to make a review?
Sure. It looks like it still needs some of the issues resolved (crc32, there's TODOs in the PR, etc)
@andyxning please take over this work if you want to - I am no longer interested in completing this. If you create your own PR I can close this one.