python-memcached icon indicating copy to clipboard operation
python-memcached copied to clipboard

Consistent Hashing - Ketama

Open haridas opened this issue 10 years ago • 14 comments

Hi,

I have the working implementation of ketama consistent hashing algorithm on top of your library. May I send you pull request ? Or you are keeping the library as minimal as possible ? Please let me know, I would love to share my implementation.

Warm Regards, Haridas N.

haridas avatar Oct 21 '14 15:10 haridas

I would really like to have Ketama in this library as well. Haridas, is your implementation available somewhere that I can get it?

sergio97 avatar Mar 19 '15 17:03 sergio97

Thanks for showing interest in adding this feature into the library. I will send you a pull request with Ketama implementation done on it.

haridas avatar Mar 22 '15 05:03 haridas

Hi,

I committed changes on the forked version of this repository. This is the initial version of changes to see how it works and how I was used it before in our environment. Here is the link - https://github.com/haridas/python-memcached/commit/18ac1a8e5aa83b276ea5b9702e5f38ecae4e57e7.

There is no test case added to this implementation, since the testcase setup required multiple memcache server, I'm holding on it. Let me know better methods to handle this scenario.

Here is separate gist of the same implementation with testcase. If there is 8 memcachd server running on your machine, then you can test this script by running python new_memcached.py.

Gist link - https://gist.github.com/e7db5d28536fab4ebe52.git

Looking forward to your comments on the implementation.

Hari

haridas avatar Mar 24 '15 03:03 haridas

Thank you Haridas! I will be making a few tweaks to your code before I use it. In case you want my changes I'll open a pull request to your fork when I'm finished.

sergio97 avatar Mar 25 '15 18:03 sergio97

Cool ! I would like to see the new changes that you make on the codebase, please update this issue or via PR also fine for me. Thank you.

haridas avatar Mar 26 '15 16:03 haridas

I had forgotten to reference this in the PR. Please have a look :)

sergio97 avatar Mar 27 '15 02:03 sergio97

Awesome. Thanks for making the code much better. Hoping to getting it merged soon.

haridas avatar Mar 27 '15 08:03 haridas

I'm not so happy with the structure of it at the moment. I think it would be better to create a ketama client like so:

import memcache

mc = memcache.Client('127.0.0.1', ketama=True)

or:

import memcache

mc = memcache.Client('127.0.0.1', placement="ketama") # default is "modula"

I'm not sure exactly how to structure the code to achieve this and I just wanted to get it working for now. Combining the two classes into 1 would be messy. I was thinking to pull out "key placement" logic into its own class, the same way that servers are their own _Host class.

Any input?

sergio97 avatar Mar 27 '15 14:03 sergio97

I was thinking the same thing before making this commit. We can achieve this by use of Meta class. So the overloaded methods can be injected based on the client type ( ketma or normal) on the class creation time itself. By default the client type will be normal.

But then I thought, existing library is very simple to use and flat in structure, and adding this feature by injecting everything in to the Client class would increase the complexity to maintain and use it. So separately shipping an new client is a good trade off for library users and library maintainers.

haridas avatar Mar 29 '15 09:03 haridas

Glad to see this might actually happen!

Around line 1508, I was wondering if bisect.bisect_left might be a better choice, i.e.

slot = bisect.bisect_left(self._ketama_server_slots, h_key)
if slot < len(self._ketama_server_slots):
    server = self._ketama_server_ring[slot]
    if server.connect():
        return (server, key)

bisect.bisect_left will run in log(n) time.

ehrmann avatar Mar 31 '15 23:03 ehrmann

Its very good suggestion. I never came across the bisect module. I think it internally does binary search to locate the position, hence the log(n) time complexity.

I will surely update my commit.

Thank you, Hari.

On Wed, Apr 1, 2015 at 4:31 AM, David Ehrmann [email protected] wrote:

Glad to see this might actually happen!

Around line 1508 https://github.com/haridas/python-memcached/commit/18ac1a8e5aa83b276ea5b9702e5f38ecae4e57e7#diff-894c02c873c7b32549152655db8797a8R1508, I was wondering if bisect.bisect_left might be a better choice, i.e.

slot = bisect.bisect_left(self._ketama_server_slots, h_key) if slot < len(self._ketama_server_slots): server = self._ketama_server_ring[slot] if server.connect(): return (server, key)

bisect.bisect_left will run in log(n) time.

— Reply to this email directly or view it on GitHub https://github.com/linsomniac/python-memcached/issues/61#issuecomment-88278207 .

Have a nice day. Haridas N.

haridas avatar Apr 01 '15 02:04 haridas

@sergio97 Can you please send a PR from your fork to my fork's master ? I would love to have your changes on my fork also :)

haridas avatar Apr 01 '15 05:04 haridas

Since you were asking about testing it, one way would be to restructure the code so you inject a Sharder to the Client rather than inheriting and overriding. The benefit is that you can test the Ketama implementation separately, outside of the client code. But this is also a very "Java" way of doing things.

ehrmann avatar Apr 01 '15 06:04 ehrmann

I was wondering if any consistent hashing functionality ever made it in? Browsing the source it looks like server selection is still using simple modulo assignment, per https://github.com/linsomniac/python-memcached/blob/bad41222379102e3f18f6f2f7be3ee608de6fbff/memcache.py#L432

fluffy-critter avatar Nov 28 '19 21:11 fluffy-critter