cqerl
Add support for Token Aware policy
We do many writes to Cassandra using your driver. We would love to use the TokenAware policy, where the driver routes the request to the primary node where the partition is stored. More information is here:
http://datastax.github.io/ruby-driver/features/load_balancing/token_aware/
Thanks
I'm currently working on implementing the different levels of policies for load balancing and failover, as other drivers do. Stay tuned.
Nice! Let us know if you need a beta user.
+1 on this. Token-aware routing has huge performance benefits on very large clusters/systems.
As I understand it, it's a pretty large improvement to the driver. We need to:
- [ ] Automatically detect all members of a Cassandra cluster. This basically means reusing the cluster concept we have, but querying some node for peer information, and updating the list of member nodes accordingly.
- [ ] Keep track of the set of tokens owned by each member node. This lets us select the right node for a given routing key (see the sketch after this list).
- [ ] Listen for Cassandra events about topology changes to keep the two above up to date at all times. This has been tracked by #15 for a long time, but I never got around to implementing it fully.
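To make the first two items concrete, here is a minimal sketch, not something cqerl exposes today, of how peer and token information could be read from the system tables and used to pick an owning node. It assumes cqerl's documented client API (cqerl:get_client/2, cqerl:run_query/2, cqerl:all_rows/1 returning rows as proplists with atom column names) and leaves the Murmur3 hashing of the routing key to an external helper.

```erlang
%% Sketch only: discover cluster members and their tokens via the system
%% tables, then look up the node owning a given token on the ring.
-module(token_ring_sketch).
-export([fetch_ring/1, node_for_token/2]).

%% Returns a list of {TokenInteger, NodeAddress} pairs sorted by token.
%% Tokens are stored as text in system.local/system.peers, so they are
%% converted to integers here (Murmur3 tokens are signed 64-bit values).
fetch_ring(ContactPoint) ->
    {ok, Client} = cqerl:get_client(ContactPoint, []),
    {ok, LocalRes} = cqerl:run_query(Client,
        <<"SELECT broadcast_address, tokens FROM system.local;">>),
    {ok, PeersRes} = cqerl:run_query(Client,
        <<"SELECT peer, tokens FROM system.peers;">>),
    Rows = cqerl:all_rows(LocalRes) ++ cqerl:all_rows(PeersRes),
    lists:sort(
        [{binary_to_integer(Token), node_address(Row)}
         || Row <- Rows, Token <- proplists:get_value(tokens, Row, [])]).

node_address(Row) ->
    case proplists:get_value(broadcast_address, Row) of
        undefined -> proplists:get_value(peer, Row);
        Address   -> Address
    end.

%% Given the sorted ring and the token of a routing key (computed elsewhere,
%% e.g. with a Murmur3 NIF), pick the node owning the first token greater
%% than or equal to it, wrapping around to the start of the ring.
node_for_token(Ring = [{_, FirstNode} | _], Token) ->
    case lists:dropwhile(fun({T, _}) -> T < Token end, Ring) of
        [{_, Node} | _] -> Node;
        []              -> FirstNode
    end.
```

The third item would additionally require sending a REGISTER frame for TOPOLOGY_CHANGE and STATUS_CHANGE events on a dedicated connection, which is the part tracked by #15.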
Hello @matehat
Maybe you can take a look at the erlcass documentation and implement something similar to what DataStax did (https://github.com/silviucpp/erlcass is just a wrapper around the DataStax C/C++ driver). Basically they have the following policies:
load_balance_round_robin
Example: {load_balance_round_robin, true}
Configures the cluster to use round-robin load balancing. The driver discovers all nodes in a cluster and cycles through them per request. All are considered 'local'.
load_balance_dc_aware
Example: {load_balance_dc_aware, {"dc_name", 2, true}}
Configures the cluster to use DC-aware load balancing. For each query, all live nodes in a primary 'local' DC are tried first, followed by any node from other DCs. Note:
This is the default, and does not need to be set unless switching an existing cluster from another policy or changing settings. Without further configuration, a default local_dc is chosen from the first connected contact point, and no remote hosts are considered in query plans. If relying on this mechanism, be sure to use only contact points from the local DC. Params:
{load_balance_dc_aware, {LocalDc, UsedHostsPerRemoteDc, AllowRemoteDcsForLocalCl}}
LocalDc - The primary data center to try first
UsedHostsPerRemoteDc - The number of hosts used in each remote DC if no hosts are available in the local DC
AllowRemoteDcsForLocalCl - Allows remote hosts to be used if no local-DC hosts are available and the consistency level is LOCAL_ONE or LOCAL_QUORUM
token_aware_routing
Example: {token_aware_routing, true}
Configures the cluster to use token-aware request routing, or not. This routing policy composes the base routing policy, routing requests first to replicas on nodes considered 'local' by the base load balancing policy.
Default is true (enabled).
latency_aware_routing
Example:
{latency_aware_routing, true}
{latency_aware_routing, {true, {2.0, 100, 10000, 100, 50}}}
Configures the cluster to use latency-aware request routing, or not. This routing policy is a top-level routing policy. It uses the base routing policy to determine locality (dc-aware) and/or placement (token-aware) before considering the latency. Params:
{Enabled, {ExclusionThreshold, ScaleMs, RetryPeriodMs, UpdateRateMs, MinMeasured}}
Enabled - Whether latency-aware routing is enabled
ExclusionThreshold - Controls how much worse a node's latency must be, compared to the average latency of the best-performing node, before it is penalized.
ScaleMs - Controls the weight given to older latencies when calculating the average latency of a node. A bigger scale gives more weight to older latency measurements.
RetryPeriodMs - The amount of time a node is penalized by the policy before being given a second chance when the current average latency exceeds the calculated threshold (ExclusionThreshold * BestAverageLatency).
UpdateRateMs - The rate at which the best average latency is recomputed.
MinMeasured - The minimum number of measurements per-host required to be considered by the policy.
Defaults: {false, {2.0, 100, 10000, 100, 50}}
The idea is that you can combine them. For example, in production I have 3 clusters: one with SSDs, one with somewhat slower HDDs, and one used for analytics (Spark and Solr are there as well). I configured the erlcass driver to use the realtime cluster as the main source when it's available, and to fall back on the others if something goes wrong or there aren't enough replicas to meet the consistency level in the main cluster.
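For reference, and assuming the cluster_options proplist format shown in the erlcass README, combining the policies above in sys.config would look roughly like this (the contact points and the "dc_realtime" name are placeholders):

```erlang
%% sys.config sketch: erlcass cluster options combining the policies above.
[
 {erlcass, [
    {cluster_options, [
        {contact_points, <<"10.0.0.1,10.0.0.2,10.0.0.3">>},
        %% Try the realtime DC first; allow up to 2 hosts per remote DC
        %% when the local DC cannot serve the request.
        {load_balance_dc_aware, {"dc_realtime", 2, true}},
        %% Route each request to a replica that owns the partition.
        {token_aware_routing, true},
        %% Penalize nodes whose average latency is more than 2x the best.
        {latency_aware_routing, {true, {2.0, 100, 10000, 100, 50}}}
    ]}
 ]}
].
```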
Silviu
Progress on this is happening on the v2.0-pre branch