Question about consistent hashing

Open willshulman opened this issue 11 years ago • 8 comments

Let's say I want to use nutcracker as a Redis proxy in front of a set of persistence-less Redis nodes that I will use as a cache. Let's also say I use a consistent hashing scheme.

My understanding is that if a node leaves the cluster, the hashing algorithm will result in keys that were on the original node eventually being stored elsewhere. But what happens when the original node comes back (let's say it was just a temporary network partition)? How does nutcracker deal with the fact that the same key may now exist on more than one node in the ring?

willshulman avatar Aug 08 '14 17:08 willshulman
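
A minimal sketch of the behaviour being asked about, assuming a toy hash ring rather than twemproxy's actual ketama implementation (node names, the hash, and the virtual-point count are made up): when a node is ejected its keys remap to the surviving nodes, and when it rejoins the same keys route straight back to it, so a key can end up with a copy on two nodes at once.

    import bisect
    import hashlib

    class Ring:
        """Toy consistent-hash ring (illustrative only, not ketama-compatible)."""

        def __init__(self, nodes, vnodes=160):
            self.vnodes = vnodes
            self.ring = []  # sorted list of (point, node)
            for node in nodes:
                self.add(node)

        def _hash(self, key):
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def add(self, node):
            for i in range(self.vnodes):
                bisect.insort(self.ring, (self._hash(f"{node}-{i}"), node))

        def remove(self, node):
            self.ring = [(p, n) for p, n in self.ring if n != node]

        def node_for(self, key):
            idx = bisect.bisect(self.ring, (self._hash(key), "")) % len(self.ring)
            return self.ring[idx][1]

    ring = Ring(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
    key = "user:42"
    before = ring.node_for(key)   # owner before the partition
    ring.remove(before)           # node is ejected during the partition
    during = ring.node_for(key)   # writes for the key now land on another node
    ring.add(before)              # partition heals, node rejoins with its data intact
    after = ring.node_for(key)    # routing returns to the original owner
    print(before, during, after)  # before == after, so two nodes can hold the key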

It doesn't deal with that. It's a cache, so the data will eventually expire on the old node, and you will need to fetch the data from the main source to populate the new node.

mezzatto avatar Aug 08 '14 17:08 mezzatto
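
mezzatto's answer describes the usual cache-aside pattern: cached values carry a TTL, so any copy left on the rejoined node ages out on its own, and misses are repopulated from the source of truth. A rough sketch with redis-py pointed at nutcracker's listen port; the host, port, TTL, and load_from_database are placeholders, not anything from this thread.

    import json
    import redis

    # Client pointed at nutcracker's listen address (placeholder values).
    cache = redis.Redis(host="127.0.0.1", port=22121)

    CACHE_TTL = 300  # seconds; a stale copy on a rejoined node simply expires

    def load_from_database(key):
        """Placeholder for the real source of truth."""
        return {"id": key, "loaded": True}

    def get_cached(key):
        raw = cache.get(key)
        if raw is not None:
            return json.loads(raw)           # cache hit
        value = load_from_database(key)      # miss: go back to the main source
        cache.set(key, json.dumps(value), ex=CACHE_TTL)
        return value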

But even if the data on the old node will eventually expire, once the old node rejoins the cluster (with its data still in memory, since it was never down, just partitioned from the network), how will nutcracker know where to route requests for a key that has a value on both the old node and another node? Will it just use the consistent hashing algorithm, find the data on the old node now that it is back, and leave the newer value on the alternate node to expire or be evicted by LRU?

willshulman avatar Aug 08 '14 18:08 willshulman

Hi, I just wanted to bump this to see if anyone can confirm the behavior in this scenario.

willshulman avatar Aug 14 '14 06:08 willshulman

In a split-brain situation there is a hole. Generally, twemproxy tries to reconnect to cache servers, so there is some possibility of reading an old cached value from a cache server after a split brain.

Ex) There are two servers, A and B:

  1. Write key 1; it is stored on B.
  2. A split brain occurs, so key 1 is written to A.
  3. Key 1 is updated.
  4. The split brain is resolved.
  5. Key 1 is read from A (this is the old cached value).

charsyam avatar Aug 14 '14 06:08 charsyam
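
One way to observe the hole charsyam describes is to bypass the proxy after the partition heals and query each backend directly: if the key was rewritten on the fallback node during the split, both nodes report a value and the two copies can differ. A small illustrative check with redis-py; the backend addresses and the key are placeholders.

    import redis

    # Backend Redis nodes behind nutcracker, addressed directly (placeholders).
    backends = {
        "redis-a": redis.Redis(host="10.0.0.1", port=6379),
        "redis-b": redis.Redis(host="10.0.0.2", port=6379),
    }

    key = "1"
    for name, client in backends.items():
        print(f"{name}: value={client.get(key)!r} ttl={client.ttl(key)}")
    # If both nodes print a value, the ring holds two copies of the key, and a read
    # through twemproxy returns whichever copy the current ring view points at.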

Got it. Thank you, this is exactly the scenario I am describing.

Has there ever been any discussion of adding a feature that lets you configure the system so that shards are cleared of all cache data when they rejoin the set? I can see this being useful in scenarios where one would prefer cache misses over reading old values.

-will

willshulman avatar Aug 14 '14 17:08 willshulman
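
There is no such option built into twemproxy, but as a hedged sketch of the operational workaround the request above implies, a rejoining node could be flushed before it is allowed to take traffic again, so it produces cache misses instead of stale hits. The address and the helper name are made up for illustration.

    import redis

    def flush_before_rejoin(host, port=6379):
        """Clear a cache node that is about to rejoin the pool (illustrative only)."""
        node = redis.Redis(host=host, port=port)
        node.ping()      # confirm the node is reachable again
        node.flushall()  # drop anything written before or during the partition
        # From here on it is safe to let twemproxy route traffic back to the node:
        # every request misses and is repopulated from the main source.

    # Example: run against the node that was partitioned (placeholder address).
    # flush_before_rejoin("10.0.0.1")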

Hi Will,

Any news on this issue? I am about to do some testing of a report that a Redis node, when rebooted, leaves the twemproxy members set and never joins again. I wonder if this is similar?

We are using cloud-based systems, so a reboot can take some time.

Our config currently looks something like this:

twemproxy::resource::nutcracker4 { 'redis-twemproxy':
  port                 => '7379',
  nutcracker_hash      => 'fnv1a_64',
  nutcracker_hash_tag  => '{}',
  distribution         => 'ketama',
  twemproxy_timeout    => '300',

  auto_eject_hosts     => true,
  server_retry_timeout => '500',
  server_failure_limit => '3',

  members              => [
    ... etc
  ]
}

ghost avatar May 05 '15 13:05 ghost
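
For reference, the Puppet parameters above presumably render a nutcracker pool roughly like the following; the pool name, listen address, and server list are placeholders, while the keys themselves are standard twemproxy configuration options. With auto_eject_hosts: true, a server that fails server_failure_limit consecutive requests is temporarily removed from the ring and retried every server_retry_timeout milliseconds, which is exactly the window in which keys get rehashed onto the remaining members.

    redis-twemproxy:
      listen: 0.0.0.0:7379
      hash: fnv1a_64
      hash_tag: "{}"
      distribution: ketama
      timeout: 300
      redis: true
      auto_eject_hosts: true
      server_retry_timeout: 500
      server_failure_limit: 3
      servers:
        - 10.0.0.1:6379:1 redis-a
        - 10.0.0.2:6379:1 redis-b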

+1. It would be great if server_retry_timeout could be set to an infinite value, or if config reload / graceful restart were supported.

axot avatar Feb 04 '16 02:02 axot

Per https://github.com/twitter/twemproxy/issues/608, ways to avoid these inconsistencies are planned:

  1. Redis Sentinel support will be part of an upcoming release, allowing twemproxy to automatically fail over to a replica server.
  2. A separate failover pool is planned for memcached.

TysonAndre avatar Aug 06 '21 14:08 TysonAndre