[NEW] New Command field classification.
The problem/use-case that the feature addresses
Modern Valkey clients use the information returned by the "COMMAND INFO" command to determine how to handle the fields of a command. In particular, the determination of which fields contain a key is critical to allowing the client to properly route commands to the correct shards in a CME cluster.
Commands which access search indexes are optimally handled by a routing algorithm that cannot be described with the currently available options (not_a_key comes the closest). This leads to sub-optimal (unbalanced) cluster performance or requires that application programmers assume routing responsibility -- something which the community has learned is hard to get right.
Description of the feature
The request on the server side is to add one more option for how to handle a command field -- Index. Tagging a field as an Index tells a modern client to follow the routing algorithm described below.
Soon, the Search module will have two types of indexes: cluster-wide and single-slot. Each has its own client-side routing algorithm. Clients are easily able to determine which algorithm to use by inspecting the syntax of the field contents -- no client-side state is required. If a Valkey hashtag is present, this is a single-slot index. If no hashtag is present, the index is cluster-wide.
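A minimal sketch of the client-side check, assuming that "a hashtag is present" follows the same rule Valkey applies to keys (the first `{` followed by a later `}` with at least one character in between). The function names `extract_hashtag` and `classify_index` are hypothetical and not part of any existing client.

```python
from typing import Optional


def extract_hashtag(name: str) -> Optional[str]:
    """Return the hashtag content of an index name, or None if there is none.

    Assumption: index names use the same hashtag rule as keys, i.e. the
    first '{' and the first '}' after it, with a non-empty body between.
    """
    start = name.find("{")
    if start == -1:
        return None
    end = name.find("}", start + 1)
    if end == -1 or end == start + 1:
        # No closing brace, or an empty "{}": treated as "no hashtag".
        return None
    return name[start + 1:end]


def classify_index(name: str) -> str:
    """Pick the routing algorithm from the index name alone (no client state)."""
    return "single-slot" if extract_hashtag(name) is not None else "cluster-wide"
```

For example, under these assumptions `classify_index("orders{user1}")` selects single-slot routing, while `classify_index("products")` selects cluster-wide routing.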
*** Cluster-wide Index Routing ***
Cluster-wide indexes are present on every node in the system, so a command can be sent to any shard for execution. The node on which the command lands performs more work than the other nodes in the cluster, so the client should distribute these commands across the shards in order to balance the load. No particular load-balancing algorithm is required. The client is free to use any algorithm, e.g., random, round-robin, etc.
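As an illustration only, round-robin distribution over shards could look like the sketch below. The class name `ShardBalancer` and the use of plain strings as shard identifiers are assumptions made for this example; a real client would plug in its own shard objects, and random selection would satisfy the proposal equally well.

```python
import itertools
from typing import Iterable, Iterator


class ShardBalancer:
    """Hypothetical round-robin picker for commands on cluster-wide indexes."""

    def __init__(self, shards: Iterable[str]) -> None:
        shard_list = list(shards)
        if not shard_list:
            raise ValueError("at least one shard is required")
        # itertools.cycle repeats the shard list forever, in order.
        self._cycle: Iterator[str] = itertools.cycle(shard_list)

    def next_shard(self) -> str:
        """Return the shard that should receive the next command."""
        return next(self._cycle)
```

The sketch does not handle topology changes; a client would need to rebuild the balancer when shards are added or removed.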
*** Single-Slot Index Routing ***
Single-slot indexes are present only on a single shard. Clients can route commands to single-slot indexes by treating the index field just like a key.
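"Just like a key" would mean applying the usual cluster key hashing to the index name. A sketch, assuming the standard scheme (CRC16/XMODEM of the hashtag content, modulo 16384 slots) applies unchanged to index fields; the function name `hash_slot` is hypothetical. Python's `binascii.crc_hqx` with an initial value of 0 computes that CRC variant.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a cluster


def hash_slot(name: str) -> int:
    """Compute the hash slot for a key or (by assumption) an index name."""
    data = name.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            # Only the non-empty hashtag content is hashed.
            data = data[start + 1:end]
    # crc_hqx uses polynomial 0x1021; an initial value of 0 gives CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % SLOT_COUNT
```

With this, an index named `orders{user1}` maps to the same slot as the key `user1`, so the client can reuse its existing slot-to-shard table.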
Note that the description above is shard oriented. Route selection within a shard must honor the READONLY/READWRITE status of the issuing connection.
@allenss-amazon I read your suggestion. Some small comments:
Clients are easily able to determine which algorithm to use by inspecting the syntax of the field contents -- no client-side state is required. If a Valkey hashtag is present, this is a single-slot index. If no hashtag is present, the index is cluster-wide.
How is that different from the current client routing logic? With regular keys, a client might route the command to a specific shard and potentially get a MOVED error. In the case of a per-slot index, would we have the same behavior of responding with a MOVED error? If so, why is it important that the client base its routing logic on the existence of a hashtag rather than just apply its own logic to decide on the routing?
The request on the server side is to add one more option for how to handle a command field -- Index. Tagging a field as an Index tells a modern client to follow the routing algorithm described below.
What is the actual request here? Is it just to add a command flag, or something closer to first_key, last_key, and step? I think adding a flag is probably easier and potentially a less breaking change, but it sounds like you are suggesting the ability to explicitly target a subset of arguments, right?
@allenss-amazon I read your suggestion. Some small comments:
Clients are easily able to determine which algorithm to use by inspecting the syntax of the field contents -- no client-side state is required. If a Valkey hashtag is present, this is a single-slot index. If no hashtag is present, the index is cluster-wide.
How is that different from the current client routing logic? With regular keys, a client might route the command to a specific shard and potentially get a MOVED error. In the case of a per-slot index, would we have the same behavior of responding with a MOVED error? If so, why is it important that the client base its routing logic on the existence of a hashtag rather than just apply its own logic to decide on the routing?
For the per-slot index, you are correct: a MOVED response would provide the same correct functionality. However, it would be the wrong functionality for a cluster-wide index. In the cluster-wide case, the standard functionality would yield the worst case, in which 100% of the query traffic is routed to the same shard, preventing load balancing of the dispatch/merge part of the operation. The COMMAND INFO data structures are on a per-command basis and don't provide a mechanism to distinguish between these two situations.
The request on the server side is to add one more option for how to handle a command field -- Index. Tagging a field as an Index tells a modern client to follow the routing algorithm described below.
What is the actual request here? Is it just to add a command flag, or something closer to first_key, last_key, and step? I think adding a flag is probably easier and potentially a less breaking change, but it sounds like you are suggesting the ability to explicitly target a subset of arguments, right?
Today, the command info identifies keys and non-keys. I'm saying there's a third type of command argument: indexes. As described above, indexes need routing that's content-dependent.