Title:
[MasterReplica] ReadFrom.REPLICA_PREFERRED fails entirely when first replica is unreachable, even before Sentinel marks it +sdown
Labels (suggested): bug, area:master-replica, area:sentinel, status:triage
Describe the bug
When using ReadFrom.REPLICA_PREFERRED with a Sentinel-managed Master-Replica topology, if the first selected replica becomes unreachable (e.g. network partition, container restart), MasterReplicaConnectionProvider#getConnectionAsync(ConnectionIntent.READ) fails immediately with a RedisConnectionException, even though:
- Other replicas are healthy and responsive.
- Sentinel has not yet marked the failed replica as `+sdown` (subjectively down).
This creates a critical availability gap: the client gives up on reading before the cluster coordination layer (Sentinel) has even declared the node down, defeating the purpose of high-availability read scaling.
Expected behavior
The Lettuce client should:
- Tolerate transient or partial node failures during connection selection.
- Continue attempting the other replicas in the `REPLICA_PREFERRED` selection list.
- Only fall back to the master if all replicas fail to connect.
- Respect Sentinel's eventual consistency: do not preemptively fail reads just because one node is slow to respond.
In short: client-level connection selection should be more resilient than Sentinel's quorum-based failure detection (see the sketch just below for what this could look like).
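For concreteness, here is a minimal sketch of the selection policy described above. It is an illustration only, not existing Lettuce code: it reuses the `getConnection(node)` and `selection` names that appear in the MasterReplicaConnectionProvider snippets quoted later in this report, and the method itself (`selectForRead`) is hypothetical.

```java
// Illustration of the requested semantics, not a patch: try each candidate in
// selection order (replicas first, master last for REPLICA_PREFERRED), skip
// nodes that fail to connect, and only throw once every candidate is exhausted.
StatefulRedisConnection<String, String> selectForRead(List<RedisNodeDescription> selection) {
    RedisConnectionException last = null;
    for (RedisNodeDescription node : selection) {
        try {
            // getConnection(node) stands in for the provider's per-node connect future
            return getConnection(node).join();
        } catch (CompletionException e) {
            // remember the failure, but keep trying the remaining candidates
            last = new RedisConnectionException("Unable to connect to " + node.getUri(), e.getCause());
        }
    }
    throw last != null ? last : new RedisConnectionException("No readable node in " + selection);
}
```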
Current behavior
In MasterReplicaConnectionProvider, the candidate connections are concatenated into a single Flux:

```java
for (RedisNodeDescription node : selection) {
    connections = connections.concatWith(Mono.fromFuture(getConnection(node)));
}
```
Hey @xiongrl,
we will get back to you asap. In the meantime, if you can provide a minimal reproducible example, that would greatly speed up the process. Thanks!
Hi @tishun ,
Thank you for the quick response! Below is the complete, precise, and 100% reproducible description of the bug I’m seeing in production.
Real Production Symptom (exact scenario)
- Redis topology: Sentinel HA, 1 master + 2 slaves
- Lettuce configuration: `client.setReadFrom(ReadFrom.REPLICA_PREFERRED)` (a minimal setup is sketched after this list)
- One slave (e.g. `10.15.32.68`) experiences a temporary network glitch or is being restarted
- Sentinel has NOT yet marked it `+sdown` (still within `down-after-milliseconds`, usually 30 s)
  → `SENTINEL slaves mymaster` still returns both slaves
  → the other slave is 100% healthy
- Lettuce topology discovery returns `selection = [slave1 (faulty), slave2 (healthy)]` (slave1 ranks first because of nodeId alphabetical ordering)
- Any read command instantly throws:

```
io.lettuce.core.RedisConnectionException: Unable to connect to 10.15.32.68:6379
	at io.lettuce.core.masterreplica.MasterReplicaConnectionProvider.getConnectionAsync(MasterReplicaConnectionProvider.java:...)
```
Result: Even though a perfectly healthy slave exists and Sentinel has not declared the first one down, all reads fail immediately. This creates a severe availability gap during rolling restarts, network flaps, etc.
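To make the reproduction concrete, a minimal client-side setup that drives this code path could look like the following. The Sentinel address, master name, and key are placeholders; everything else is the standard Lettuce Master/Replica API.

```java
import io.lettuce.core.ReadFrom;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.codec.StringCodec;
import io.lettuce.core.masterreplica.MasterReplica;
import io.lettuce.core.masterreplica.StatefulRedisMasterReplicaConnection;

public class ReplicaPreferredRepro {

    public static void main(String[] args) {
        RedisClient client = RedisClient.create();

        // Placeholder Sentinel endpoint and master name.
        RedisURI sentinelUri = RedisURI.Builder
                .sentinel("sentinel-host", 26379, "mymaster")
                .build();

        StatefulRedisMasterReplicaConnection<String, String> connection =
                MasterReplica.connect(client, StringCodec.UTF8, sentinelUri);
        connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);

        // With one replica unreachable (but not yet +sdown), this read fails
        // immediately in MasterReplicaConnectionProvider#getConnectionAsync(READ)
        // instead of falling over to the healthy replica.
        System.out.println(connection.sync().get("some-key"));

        connection.close();
        client.shutdown();
    }
}
```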
Root Cause — Confirmed in Lettuce 6.3.2 / 6.4.x Source
ReadFrom.REPLICA_PREFERRED is implemented as:
```java
public static final ReadFrom REPLICA_PREFERRED = new ReadFromImpl.ReadFromReplicaPreferred();

static final class ReadFromReplicaPreferred extends OrderedPredicateReadFromAdapter {

    ReadFromReplicaPreferred() {
        super(IS_REPLICA, IS_UPSTREAM);
    }
}
```
Because this ReadFrom is order-sensitive, MasterReplicaConnectionProvider#getConnectionAsync selects the connection via the ordered branch:
```java
if (OrderingReadFromAccessor.isOrderSensitive(readFrom) || selection.size() == 1) {
    return connections.filter(StatefulConnection::isOpen)
            .next() // should try nodes in order
            .switchIfEmpty(connections.next())
            .toFuture();
}
```
BUT the Flux is built like this:
```java
Flux<StatefulRedisConnection<K, V>> connections = Flux.empty();

for (RedisNodeDescription node : selection) {
    connections = connections.concatWith(Mono.fromFuture(getConnection(node)));
}
```
Flux.concatWith subscribes to the concatenated sources sequentially and short-circuits on the first error (and .switchIfEmpty only handles empty completion, not errors, so it never gets a chance to retry).
When getConnection(slave1) fails → the first Mono errors → the entire Flux terminates → slave2 is never attempted → .next() never sees a successful connection → the exception is surfaced directly to the caller.
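A tiny, self-contained Reactor snippet (independent of Lettuce) illustrates the short-circuit; the node names are placeholders:

```java
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class ConcatShortCircuitDemo {

    public static void main(String[] args) {
        // Mirrors how the provider builds its Flux: one source per node, concatenated.
        Flux<String> connections = Flux.<String>empty()
                .concatWith(Mono.error(new IllegalStateException("slave1 unreachable")))
                .concatWith(Mono.just("connection to slave2"));

        // Prints the error from "slave1"; "connection to slave2" is never emitted,
        // because concat never subscribes to the second source after the first errors.
        connections.next()
                .subscribe(
                        value -> System.out.println("got: " + value),
                        error -> System.out.println("failed: " + error.getMessage()));
    }
}
```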
Conclusion: The intended “try the next node on failure” behavior is structurally unreachable due to concatWith semantics.
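One possible direction (a sketch only, not a tested patch, reusing the `getConnection(node)` and `selection` names from the provider code above): contain per-node connect failures so the concatenation can move on to the next candidate, and only fail once every candidate has been tried.

```java
// Sketch: build the candidate Flux so that a single node's connect failure does
// not terminate the stream. Each per-node error is converted into an empty Mono,
// so .next() keeps pulling from the remaining candidates (replicas first, then
// the master for REPLICA_PREFERRED).
Flux<StatefulRedisConnection<K, V>> connections = Flux.empty();

for (RedisNodeDescription node : selection) {
    connections = connections.concatWith(
            Mono.fromFuture(getConnection(node))
                    .onErrorResume(ex -> Mono.empty())); // skip unreachable node, try the next one
}

return connections.filter(StatefulConnection::isOpen)
        .next()
        .switchIfEmpty(Mono.error(new RedisConnectionException(
                "Cannot connect to any node in " + selection)))
        .toFuture();
```

A real fix would likely want to retain the suppressed per-node exceptions so the final error stays informative; the sketch is only meant to show that errors must be contained per node before concatenation, otherwise the "try the next node" intent of the ordered branch can never trigger.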