
Support READONLY to allow exporting data from replica nodes

Open · atkretsch opened this issue 1 year ago · 1 comment

Currently, if you run riot in non-cluster mode and point it at a replica node URI, you get MOVED responses and the export/replication fails.

However, it would be useful to be able to export/replicate data directly from replicas, to avoid putting additional load on the primary nodes. In theory this would also allow greater throughput, because we could run a separate riot process in parallel for each shard in the cluster.
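To make the per-shard idea concrete, here is a rough sketch of how a caller might pick one replica per shard from the text output of `CLUSTER NODES`. The function name `pick_replica_per_shard` is hypothetical and not part of riot; the sketch assumes the standard space-separated `CLUSTER NODES` line format and simply takes the first connected, non-failing replica it sees for each primary.

```python
from typing import Dict


def pick_replica_per_shard(cluster_nodes: str) -> Dict[str, str]:
    """Map each primary's node ID to the host:port of one usable replica.

    Hypothetical helper: expects the raw text reply of CLUSTER NODES.
    """
    chosen: Dict[str, str] = {}
    for line in cluster_nodes.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a well-formed node line
        addr, flags, primary_id, link_state = (
            fields[1], fields[2].split(","), fields[3], fields[7]
        )
        if "slave" not in flags and "replica" not in flags:
            continue  # only replicas are of interest here
        if "fail" in flags or "fail?" in flags or "noaddr" in flags:
            continue  # skip nodes marked as failing or unreachable
        if link_state != "connected":
            continue
        # Address looks like ip:port@cport, optionally followed by ",hostname"
        host_port = addr.split("@")[0]
        # Keep the first usable replica found for each primary
        chosen.setdefault(primary_id, host_port)
    return chosen
```

Each resulting address could then be passed as the source URI of its own riot process, assuming riot can read from a replica as proposed in this issue.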

I could see this working in one of two ways:

  1. a --read-only flag (or similar) that would tell riot to send a READONLY command before any scan/read operations, so that the source URI could point to a replica node without fear of MOVED responses. In this case, it would be up to the caller of riot to understand the cluster topology, pass in the correct replica URIs, decide whether to run in parallel or in sequence, etc.
  2. abstract this behavior behind some additional flag(s) when running in -c mode, such that riot itself could handle any potential parallelization (perhaps with tuning, e.g. --replica-read-threads N). Conceptually, this would mean the caller wouldn't need to know the details of the cluster topology when invoking riot, but this is probably a much more complex change to implement.
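As an illustration of option 1, the sketch below sends READONLY and then iterates SCAN on the same connection. It is not riot's implementation; `scan_replica_keys` and the `CommandClient` protocol are hypothetical names, and the client is assumed to be any object with an `execute_command` method that returns a `(cursor, keys)` pair for SCAN.

```python
from typing import Iterator, Protocol


class CommandClient(Protocol):
    """Assumed interface: one connection that can run raw Redis commands."""

    def execute_command(self, *args): ...


def scan_replica_keys(client: CommandClient, count: int = 1000) -> Iterator[bytes]:
    """Yield all keys from a cluster replica, enabling replica reads first."""
    # READONLY is per-connection state, so it has to be sent on the same
    # connection that will later serve the reads.
    client.execute_command("READONLY")
    cursor = 0
    while True:
        cursor, keys = client.execute_command("SCAN", cursor, "COUNT", count)
        yield from keys
        if int(cursor) == 0:  # a cursor of 0 marks the end of the iteration
            break
```

Because READONLY applies only to the connection it was sent on, a pooled client would need to issue it on every connection it opens. Any per-key read done while exporting would also have to go through that same connection.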

atkretsch · Oct 09 '24 22:10