
Consistent hashing question

Open dliebner opened this issue 4 years ago • 5 comments

This isn't an issue with the library, just a question about how consistent hashing works with weights. If I change the weight of a target by some small amount, let's say out of 10 targets each with a weight of 1, I change the weight of one target to 3. How significantly will the keys be redistributed?

dliebner avatar Dec 01 '21 06:12 dliebner

Here's the relevant code: https://github.com/pda/flexihash/blob/8403c28a5453146c391992dc25cc364df365a8f5/src/Flexihash.php#L95-L100

Based on that, and some very hazy memory:

For a given weight w, that target is w times more likely to be picked than a target with w = 1. So in your example, each regular target has a 1/12 ≈ 8.3% chance, and the boosted one has 3/12 = 25%. (The denominator is 12 because there are 9 targets of weight 1 plus 1 target of weight 3.) You can use non-integer weights, like 1.5, for more subtle distribution.
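The arithmetic above can be checked directly. A tiny sketch (target names like `t0` are hypothetical, not anything from flexihash):

```python
weights = {f"t{n}": 1 for n in range(10)}  # ten hypothetical targets, weight 1 each
weights["t0"] = 3                          # boost one target to weight 3
total = sum(weights.values())              # 9 * 1 + 3 = 12
shares = {t: w / total for t, w in weights.items()}
print(shares["t0"])  # 0.25
print(shares["t1"])  # ~0.0833
```

Note these are expected shares; on a real ring the actual split wobbles around them depending on where the hashed points happen to land.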

pda avatar Dec 01 '21 08:12 pda

Thanks! So a higher weight => more points on the ring, as chosen by hashing $target.$i, so the original placements remain the same; only new points are added, "stealing" keys from the other targets in a fairly even manner. Am I understanding that correctly?
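That minimal-disruption property can be demonstrated with a toy ring. This is a sketch of the general virtual-node scheme (points keyed by hashing target name + index, as in the linked Flexihash code), not the library itself; the hash, replica count, and target names are all assumptions:

```python
import hashlib
from bisect import bisect

def hash32(s):
    # 32-bit value from md5, just for illustration
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

def build_ring(weights, replicas=64):
    # replicas * weight points per target, keyed by hash(target + index)
    return sorted((hash32(f"{t}{i}"), t)
                  for t, w in weights.items()
                  for i in range(replicas * w))

def lookup(ring, key):
    # first point clockwise from the key's position (wrapping around)
    positions = [p for p, _ in ring]
    return ring[bisect(positions, hash32(key)) % len(ring)][1]

keys = [f"key{k}" for k in range(5000)]
ring_before = build_ring({f"t{n}": 1 for n in range(10)})
before = {k: lookup(ring_before, k) for k in keys}

weights = {f"t{n}": 1 for n in range(10)}
weights["t0"] = 3  # boost one target; its original 64 points are unchanged
ring_after = build_ring(weights)

moved = [k for k in keys if lookup(ring_after, k) != before[k]]
# every key that moved went to the boosted target -- none shuffled
# between the unchanged targets, which is the consistency property
assert all(lookup(ring_after, k) == "t0" for k in moved)
print(len(moved) / len(keys))  # fraction of keys that moved
```

The assertion holds because the other targets' points never move: a key changes owner only when one of t0's new points lands between the key and its old successor.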

dliebner avatar Dec 01 '21 16:12 dliebner

This library is not really about "consistent hashing"; read more: https://en.wikipedia.org/wiki/Rendezvous_hashing

R-omk avatar Dec 01 '21 17:12 R-omk

FYI in case this is helpful, this was the original inspiration / reference material for the library 14 years ago:

https://web.archive.org/web/20080331024845/http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/

This is the part that translated to $weight in flexihash:

You can tune the amount of load you send to each server based on that server’s capacity. Imagine this spatially – more points for a server means it covers more of the ring and is more likely to get more resources assigned to it.

Mostly I found the diagrams in that blog post really helpful in forming a mental model.

pda avatar Dec 02 '21 02:12 pda

Thanks for that! I plan to implement a bounded-load layer on top of this once it becomes necessary. For anyone else out there, here are the resources I drew from when learning about consistent hashing:

- https://www.toptal.com/big-data/consistent-hashing
- https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed
- https://ai.googleblog.com/2017/04/consistent-hashing-with-bounded-loads.html
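For the curious, the bounded-loads idea from those last two links can be sketched in a few lines: cap each target at c times the average load, and when a key's natural owner is full, walk clockwise to the next target with spare capacity. This is a minimal illustration of the scheme, not flexihash code; the hash, ring layout, target names, and c value are all assumptions:

```python
import hashlib
import math
from bisect import bisect

def hash32(s):
    return int(hashlib.md5(s.encode()).hexdigest()[:8], 16)

# hypothetical ring: 64 points for each of ten targets t0..t9
ring = sorted((hash32(f"t{n}{i}"), f"t{n}") for n in range(10) for i in range(64))
positions = [p for p, _ in ring]

keys = [f"key{k}" for k in range(10000)]
c = 1.25  # load-bound factor; the paper calls this (1 + epsilon)
cap = math.ceil(c * len(keys) / 10)

load = {f"t{n}": 0 for n in range(10)}
assignment = {}
for k in keys:
    idx = bisect(positions, hash32(k)) % len(ring)
    # natural owner at capacity? walk clockwise until a target has room
    while load[ring[idx][1]] >= cap:
        idx = (idx + 1) % len(ring)
    assignment[k] = ring[idx][1]
    load[ring[idx][1]] += 1

print(max(load.values()) <= cap)  # True: no target exceeds the bound
```

The walk always terminates because total capacity (10 targets x cap) exceeds the number of keys; the trade-off is that some keys land one or more hops past their natural owner.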

dliebner avatar Dec 02 '21 02:12 dliebner