Allow keyless commands to return MOVED response
The Valkey Search module introduces new commands (FT.CREATE, FT.SEARCH, and others) that operate on indexes instead of keys (see commands). Indexes are global, and Valkey Search provides the required machinery to distribute the index metadata to all nodes in the cluster and to collect query results from all nodes. The Search module commands are currently decorated with the command flags write and readonly (see module loader).
However, since these commands are not associated with keys, they don't automatically return a MOVED response when executed on replica nodes in cluster-mode enabled Valkey clusters. I opened two issues in the Valkey Search repo to report this for FT.SEARCH (and other readonly commands) and for FT.CREATE (and other write commands). Here is an example of the FT.CREATE command behavior:
Current behavior:
127.0.0.1:6379> FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR HNSW 6 DIM 2 TYPE FLOAT32 DISTANCE_METRIC L2
(error) READONLY You can't write against a read only replica
Expected behavior:
127.0.0.1:6379> FT.CREATE json_idx1 ON JSON PREFIX 1 json: SCHEMA $.vec AS VEC VECTOR HNSW 6 DIM 2 TYPE FLOAT32 DISTANCE_METRIC L2
(error) MOVED <hash slot> <primary node ip:port>
Since this behavior is controlled by the Valkey core, the Valkey Search module would require new functionality (for example, additional command flags) to enable command routing via a MOVED response for keyless commands.
A similar issue to the one described above is already open: https://github.com/valkey-io/valkey/issues/369
The MOVED redirect contains a hash slot, but commands without keys have no hash slot. If we return slot -1 or something like that, it can break existing clients.
Many clients reload the slot mapping (CLUSTER SLOTS) when they get a MOVED redirect. Maybe they should also do that if they get a READONLY reply from a node they think is a primary? We could recommend this in the cluster documentation...
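The client-side rule suggested above can be sketched as follows. This is only an illustration in Python, not code from any existing client: the names classify_error and ErrorAction are hypothetical, and the sketch assumes the error reply arrives as a plain string such as "MOVED <slot> <ip:port>" or "READONLY ...".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorAction:
    """What a (hypothetical) cluster client does after an error reply."""
    refresh_topology: bool
    retry_target: Optional[str] = None  # "ip:port" when the reply names a node


def classify_error(reply: str, node_believed_primary: bool) -> ErrorAction:
    """Decide whether an error reply should trigger a slot-map reload."""
    parts = reply.split()
    if not parts:
        return ErrorAction(refresh_topology=False)
    code = parts[0].lstrip("-")
    if code == "MOVED" and len(parts) >= 3:
        # Existing behavior in many clients: reload slots, retry on the target.
        return ErrorAction(refresh_topology=True, retry_target=parts[2])
    if code == "READONLY" and node_believed_primary:
        # Proposed behavior: the node we treated as a primary is a replica,
        # so our view of the topology is stale. The reply names no target.
        return ErrorAction(refresh_topology=True)
    return ErrorAction(refresh_topology=False)
```

With this rule, the READONLY reply from the FT.CREATE example above would cause a topology reload only when the client had routed the command to that node as a primary.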
We could also return a valid hash slot (for example lowest / highest / random) owned by the current shard to utilize the existing redirection mechanism without breaking clients.
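A minimal sketch of the slot-selection idea, written in Python for illustration only (the real change would live in the Valkey core). The function names and the representation of a shard's slots as a list of inclusive (start, end) ranges are assumptions made for this example.

```python
import random
from typing import List, Tuple


def pick_redirect_slot(owned_ranges: List[Tuple[int, int]],
                       strategy: str = "lowest") -> int:
    """Pick a slot owned by the current shard to put in a MOVED reply.

    owned_ranges holds inclusive (start, end) slot ranges (hypothetical format).
    """
    if not owned_ranges:
        raise ValueError("shard owns no slots, so no valid slot can be returned")
    if strategy == "lowest":
        return min(start for start, _ in owned_ranges)
    if strategy == "highest":
        return max(end for _, end in owned_ranges)
    if strategy == "random":
        # Picks a range first, then a slot in it; not uniform across slots.
        start, end = random.choice(owned_ranges)
        return random.randint(start, end)
    raise ValueError(f"unknown strategy: {strategy}")


def moved_reply(slot: int, host: str, port: int) -> str:
    """Format the redirect the same way as the expected behavior above."""
    return f"MOVED {slot} {host}:{port}"
```

Note that a shard that owns no slots has nothing valid to return, so this approach would still need a fallback for that case.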
I suggest allowing a -REDIRECT error reply (as in CMD), which redirects the request to the current primary of the replica node. To limit client breakage, we would only do that when the client explicitly declared the REDIRECT capability in the initial negotiation.
This will also allow handling other cases like flushall.
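The capability gating proposed here could look roughly like the Python sketch below. It is a sketch under stated assumptions, not the core implementation: the function name, the representation of client capabilities as a set of strings, and the exact "-REDIRECT <ip:port>" reply text are all assumed for illustration.

```python
from typing import Optional, Set


def keyless_reply_on_replica(client_capabilities: Set[str],
                             primary_host: str,
                             primary_port: int,
                             is_write: bool) -> Optional[str]:
    """Return the error reply for a keyless command on a replica, or None
    when the command should simply run locally (hypothetical helper)."""
    if "redirect" in client_capabilities:
        # Client opted in during negotiation: point it at the current primary.
        return f"-REDIRECT {primary_host}:{primary_port}"
    if is_write:
        # Client did not opt in: keep today's behavior for write commands.
        return "-READONLY You can't write against a read only replica"
    return None
```

Clients that never declare the capability see no change, which is the point of gating the new reply on negotiation.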
In cases where there was a failover and the client is counting on periodic topology refreshes, the client might be unable to make progress for a long period of time (depending on the refresh frequency).
@ranshid Cluster replicas redirect even read commands like GET if you don't send the READONLY command first. They don't redirect RANDOMKEY, SCAN and KEYS though (issue #369). Should we return REDIRECT for these commands too?