
Add support for Token Aware policy

Open switzer opened this issue 10 years ago • 6 comments

We do many writes to Cassandra using your driver. We would love to use the TokenAware policy, where the driver routes each request to the primary node that stores the partition. More information is here:

http://datastax.github.io/ruby-driver/features/load_balancing/token_aware/
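
For reference, the core of token-aware routing is just a lookup on the token ring. Below is a minimal Erlang sketch of that lookup (illustrative only, not cqerl API): Ring is assumed to be a sorted list of {Token, Node} pairs, and KeyToken would come from the Murmur3 partitioner Cassandra uses by default.

  %% Pick the first node whose token is >= the key's token,
  %% wrapping around to the lowest token when the key hashes past the end of the ring.
  primary_replica(Ring, KeyToken) ->
      case [Node || {Token, Node} <- Ring, Token >= KeyToken] of
          [Node | _] -> Node;
          []         -> element(2, hd(Ring))   %% wrap around
      end.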

Thanks

switzer avatar Feb 07 '15 12:02 switzer

I'm currently working on implementing the different levels of load-balancing and failover policies, as other drivers do. Stay tuned.

matehat avatar Feb 08 '15 20:02 matehat

Nice! Let us know if you need a beta user.

switzer avatar Feb 09 '15 16:02 switzer

+1 on this. There are huge performance benefits to using token-aware routing on very large clusters/systems.

Cidan avatar May 13 '16 23:05 Cidan

As I understand it, it's a pretty large improvement to the driver. We need to:

  • [ ] Automatically detect all members of a Cassandra cluster. This basically means reusing the cluster concept we have, but querying a node for peer information and updating the list of member nodes accordingly (see the sketch after this list).
  • [ ] Keep track of the set of tokens owned by each member node. This lets us select the right node for a given routing key.
  • [ ] Listen for Cassandra events about topology changes to keep the two above up to date at all times. This has been tracked by #15 for a long time, but I never got around to implementing it fully.
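
Here is a minimal sketch of the peer-discovery step, assuming cqerl's documented get_client/run_query/all_rows API and that rows come back as proplists keyed by column name; system.peers and its tokens column are standard Cassandra system tables:

  fetch_peers() ->
      {ok, Client} = cqerl:get_client({}),
      {ok, Result} = cqerl:run_query(Client,
          "SELECT peer, data_center, rack, tokens FROM system.peers"),
      %% tokens is a set<text> of the Murmur3 tokens owned by each peer,
      %% which is exactly what a token-aware policy needs to build the ring.
      [{proplists:get_value(peer, Row), proplists:get_value(tokens, Row)}
       || Row <- cqerl:all_rows(Result)].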

matehat avatar May 16 '16 13:05 matehat

Hello @matehat

Maybe you can take a look at the erlcass documentation and implement something similar to what DataStax did (https://github.com/silviucpp/erlcass is just a wrapper in front of the DataStax C/C++ driver). Basically they have the following policies:

load_balance_round_robin

Example: {load_balance_round_robin, true}

Configures the cluster to use round-robin load balancing. The driver discovers all nodes in a cluster and cycles through them per request. All are considered 'local'.

load_balance_dc_aware

Example: {load_balance_dc_aware, {"dc_name", 2, true}}

Configures the cluster to use DC-aware load balancing. For each query, all live nodes in a primary 'local' DC are tried first, followed by any node from other DCs. Note:

This is the default, and does not need to be set unless switching an existing cluster from another policy or changing settings. Without further configuration, a default local_dc is chosen from the first connected contact point, and no remote hosts are considered in query plans. If relying on this mechanism, be sure to use only contact points from the local DC. Params:

{load_balance_dc_aware, {LocalDc, UsedHostsPerRemoteDc, AllowRemoteDcsForLocalCl}}

LocalDc - The primary data center to try first
UsedHostsPerRemoteDc - The number of hosts used in each remote DC if no hosts are available in the local DC
AllowRemoteDcsForLocalCl - Allows remote hosts to be used if no local DC hosts are available and the consistency level is LOCAL_ONE or LOCAL_QUORUM
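
To make those parameters concrete, here is an illustrative sketch (not erlcass or cqerl code) of the query plan a DC-aware policy builds from them, with Nodes as a list of {Node, Dc} pairs; the AllowRemoteDcsForLocalCl check is omitted for brevity:

  query_plan(Nodes, LocalDc, UsedHostsPerRemoteDc) ->
      %% All live nodes in the local DC are tried first...
      Local = [N || {N, Dc} <- Nodes, Dc =:= LocalDc],
      %% ...then at most UsedHostsPerRemoteDc nodes from each remote DC.
      RemoteDcs = lists:usort([Dc || {_, Dc} <- Nodes, Dc =/= LocalDc]),
      Remote = lists:flatmap(
          fun(Dc) ->
              lists:sublist([N || {N, D} <- Nodes, D =:= Dc], UsedHostsPerRemoteDc)
          end, RemoteDcs),
      Local ++ Remote.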

token_aware_routing

Example: {token_aware_routing, true}

Configures the cluster to use token-aware request routing, or not. This routing policy composes the base routing policy, routing requests first to replicas on nodes considered 'local' by the base load balancing policy.

Default is true (enabled).

latency_aware_routing

Example:

{latency_aware_routing, true}
{latency_aware_routing, {true, {2.0, 100, 10000, 100, 50}}}

Configures the cluster to use latency-aware request routing, or not. This routing policy is a top-level routing policy. It uses the base routing policy to determine locality (dc-aware) and/or placement (token-aware) before considering the latency. Params:

{Enabled, {ExclusionThreshold, ScaleMs, RetryPeriodMs, UpdateRateMs, MinMeasured}}

Enabled - Whether latency-aware routing is enabled
ExclusionThreshold - Controls how much worse the latency must be, compared to the average latency of the best performing node, before a node is penalized.
ScaleMs - Controls the weight given to older latencies when calculating the average latency of a node. A bigger scale gives more weight to older latency measurements.
RetryPeriodMs - The amount of time a node is penalized by the policy before being given a second chance when the current average latency exceeds the calculated threshold (ExclusionThreshold * BestAverageLatency).
UpdateRateMs - The rate at which the best average latency is recomputed.
MinMeasured - The minimum number of measurements per-host required to be considered by the policy.

Defaults: {false, {2.0, 100, 10000, 100, 50}}

The idea is that you can combine them. For example, in production I have 3 clusters: one with SSDs, one with somewhat slower HDDs, and one used for analytics (Spark and Solr are there as well). I configured the erlcass driver to use the realtime cluster as the main source when it's available, and to fall back to all the others if something is wrong or there are not enough replicas to meet the consistency level in the main cluster.
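
As an illustration of combining them, a sys.config entry along these lines should do it (assuming erlcass's cluster_options application environment as described in its README; the contact points and DC name are placeholders):

  {erlcass, [
      {cluster_options, [
          {contact_points, <<"10.0.0.1,10.0.0.2">>},
          {load_balance_dc_aware, {"realtime_dc", 2, true}},
          {token_aware_routing, true},
          {latency_aware_routing, {true, {2.0, 100, 10000, 100, 50}}}
      ]}
  ]}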

Silviu

silviucpp avatar May 16 '16 13:05 silviucpp

Progress on this is happening on the v2.0-pre branch

matehat avatar Jul 22 '16 13:07 matehat