Failure in CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted
CI Link
https://gradle-enterprise.elastic.co/s/yckwl5zjomfas
Repro line
./gradlew ':server:test' --tests "org.elasticsearch.cluster.coordination.CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted" -Dtests.seed=949AB28501F8ED37:BBC20DD1A5D4EF81
Does it reproduce?
Yes
Applicable branches
main
Failure history
https://ci-stats.elastic.co/s/jenkins/goto/da1547f0-2ea6-11ed-8233-4bd40c8511bb - 3 similar-looking failures over the last 90 days
Failure excerpt
java.lang.AssertionError: leaders
Expected: not an empty collection
but: was <[]>
at __randomizedtesting.SeedInfo.seed([949AB28501F8ED37:BBC20DD1A5D4EF81]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.junit.Assert.assertThat(Assert.java:956)
at org.elasticsearch.cluster.coordination.AbstractCoordinatorTestCase$Cluster.getAnyLeader(AbstractCoordinatorTestCase.java:791)
at org.elasticsearch.cluster.coordination.CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted(CoordinatorTests.java:1939)
Suite: Test class org.elasticsearch.cluster.coordination.CoordinatorTests
2> REPRODUCE WITH: ./gradlew ':server:test' --tests "org.elasticsearch.cluster.coordination.CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted" -Dtests.seed=949AB28501F8ED37 -Dtests.locale=en-AU -Dtests.timezone=Europe/Vienna -Druntime.java=18
2> java.lang.AssertionError: leaders
Expected: not an empty collection
but: was <[]>
at __randomizedtesting.SeedInfo.seed([949AB28501F8ED37:BBC20DD1A5D4EF81]:0)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.junit.Assert.assertThat(Assert.java:956)
at org.elasticsearch.cluster.coordination.AbstractCoordinatorTestCase$Cluster.getAnyLeader(AbstractCoordinatorTestCase.java:791)
at org.elasticsearch.cluster.coordination.CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted(CoordinatorTests.java:1939)
2> NOTE: leaving temporary files on disk at: /home/davidturner/src/elasticsearch/server/build/testrun/test/temp/org.elasticsearch.cluster.coordination.CoordinatorTests_949AB28501F8ED37-002
2> NOTE: test params are: codec=Asserting(Lucene94): {}, docValues:{}, maxPointsInLeafNode=1598, maxMBSortInHeap=6.485414969465074, sim=Asserting(RandomSimilarity(queryNorm=true): {}), locale=en-AU, timezone=Europe/Vienna
2> NOTE: Linux 5.4.0-125-generic amd64/Oracle Corporation 18.0.2.1 (64-bit)/cpus=8,threads=1,free=465272544,total=536870912
2> NOTE: All tests run in this JVM: [CoordinatorTests]
Tests with failures:
- org.elasticsearch.cluster.coordination.CoordinatorTests.testSingleNodeDiscoveryStabilisesEvenWhenDisrupted
Pinging @elastic/es-distributed (Team:Distributed)
Looks like the stabilisation timeout is just too short, because the following patch fixes this failure:
diff --git a/server/src/test/java/org/elasticsearch/cluster/coordination/CoordinatorTests.java b/server/src/test/java/org/elasticsearch/cluster/coordination/CoordinatorTests.java
index ca7cb106be6..49cd4a3a15b 100644
--- a/server/src/test/java/org/elasticsearch/cluster/coordination/CoordinatorTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/coordination/CoordinatorTests.java
@@ -1934,7 +1934,7 @@ public class CoordinatorTests extends AbstractCoordinatorTestCase {
// detection and ongoing publications do not time out
cluster.runFor(ELECTION_INITIAL_TIMEOUT_SETTING.get(Settings.EMPTY).millis() + delayVariabilityMillis
// two round trips for pre-voting and voting
- + 4 * delayVariabilityMillis
+ + 4 * delayVariabilityMillis + 300000
// and then the election update
+ clusterStateUpdateDelay, "stabilising");
Of course that's not the actual fix; we need to work out what this computation is not accounting for.
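For reference, here is the stabilisation budget reassembled from the hunk above (a sketch of the surrounding call based on the diff context, not necessarily the exact test code):

cluster.runFor(
    ELECTION_INITIAL_TIMEOUT_SETTING.get(Settings.EMPTY).millis() + delayVariabilityMillis
        // two round trips for pre-voting and voting
        + 4 * delayVariabilityMillis
        // and then the election update
        + clusterStateUpdateDelay,
    "stabilising"
);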
This is a hard one. I started by going back in history to figure out when this started appearing. It started appearing with https://github.com/elastic/elasticsearch/pull/86921/files , which removed the MockSinglePrioritizingExecutor in favor of a more Java-native thread executor. With the old MockSinglePrioritizingExecutor, the specific test (and random seed) passes. However, in trying to understand the difference, I found that MockSinglePrioritizingExecutor was flawed in multiple ways: e.g., it didn't emit a good string for the task to be executed, and it also threw a KillWorkerError after executing a task, which created a different order of execution than the new Java-native approach. I have the impression that during the KillWorkerError, the run() of the MockSinglePrioritizingExecutor was called again, which might insert an additional "node join" task into the DeterministicTaskQueue; that might be why it could reach stabilisation faster. In the end, due to all these differences, I stopped the comparison with MockSinglePrioritizingExecutor and will try to figure it out with the code on the main branch.
@DaveCTurner I observe that simply adding another delayVariabilityMillis to the expression you mentioned passes the test. Would this be sufficient, or would we need to further understand the calculation?
I'd like to understand why we need another delay here. It could well be because #86921 added another delay, but how exactly does that feed into the calculation? Does it mean that clusterStateUpdateDelay should now be 8 * delayVariabilityMillis? If so, should we also update the definition of DEFAULT_CLUSTER_STATE_UPDATE_DELAY?
I see. I am not yet fully aware of all these delays and what they mean. I will try to figure it out from the logs by comparing with MockSinglePrioritizingExecutor.
Does it mean that clusterStateUpdateDelay should now be 8 * delayVariabilityMillis?
Recording some notes here for future discussion: here's an excerpt of a log showing a commit which took from 5934060ms until 5934860ms, a total of 800ms.
[2022-09-20T08:19:59,812][INFO ][o.e.c.c.CoordinatorTests ] [testCanUpdateClusterStateAfterStabilisation] --> submitting value [2328060914443321389] to [{node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934160, task={node0}{hy7wWQAAQACJS0Cz_____w}: master service scheduling next task]
[2022-09-20T08:19:59,812][INFO ][o.e.c.c.CoordinatorTests ] [testCanUpdateClusterStateAfterStabilisation] --> runFor(700ms) running until [5934760ms]: stabilising
1: master service task
] advanceTime: from [5934060ms] to [5934160ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934160, task={node0}{hy7wWQAAQACJS0Cz_____w}: master service scheduling next task]
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: master service scheduling next task
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934260, task={node0}{hy7wWQAAQACJS0Cz_____w}: publish change of cluster state from version [3] in term [1] to version [4] in term [1]]
2: state publisher task (see FakeThreadPoolMasterService#publish)
] advanceTime: from [5934160ms] to [5934260ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934260, task={node0}{hy7wWQAAQACJS0Cz_____w}: publish change of cluster state from version [3] in term [1] to version [4] in term [1]]
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: publish change of cluster state from version [3] in term [1] to version [4] in term [1]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleAt: adding DeferredTask[executionTimeMillis=5964360, task={node0}{hy7wWQAAQACJS0Cz_____w}: scheduled timeout for Publication{term=1, version=4}] with extra delay of [100ms]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleAt: adding DeferredTask[executionTimeMillis=5944360, task={node0}{hy7wWQAAQACJS0Cz_____w}: scheduled timeout for reporting on Publication{term=1, version=4}] with extra delay of [100ms]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934360, task={node2}{MXR7YQAAQACXXRe5_____w}: [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934360, task={node1}{MhCa3f__T_-UXusCAAAAAA}: [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934360, task={node0}{hy7wWQAAQACJS0Cz_____w}: ActionRunnable#wrap[handling cluster state version [4] locally on [{node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}]]]
3: deliver publish requests
] advanceTime: from [5934260ms] to [5934274ms]
] advanceTime: from [5934274ms] to [5934300ms]
] advanceTime: from [5934300ms] to [5934302ms]
] advanceTime: from [5934302ms] to [5934360ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934360, task={node2}{MXR7YQAAQACXXRe5_____w}: [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934360, task={node1}{MhCa3f__T_-UXusCAAAAAA}: [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934360, task={node0}{hy7wWQAAQACJS0Cz_____w}: ActionRunnable#wrap[handling cluster state version [4] locally on [{node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}]]]
] running task 1 of 3: {node1}{MhCa3f__T_-UXusCAAAAAA}: [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}
{nodeId={node1}{MhCa3f__T_-UXusCAAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934460, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] running task 0 of 2: {node2}{MXR7YQAAQACXXRe5_____w}: [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}
{nodeId={node2}{MXR7YQAAQACXXRe5_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934460, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: ActionRunnable#wrap[handling cluster state version [4] locally on [{node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}]]
4: receive publish responses
] advanceTime: from [5934360ms] to [5934374ms]
] advanceTime: from [5934374ms] to [5934400ms]
] advanceTime: from [5934400ms] to [5934460ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934460, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934460, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
] running task 0 of 2: {node0}{hy7wWQAAQACJS0Cz_____w}: response to [486][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934560, task={node1}{MhCa3f__T_-UXusCAAAAAA}: [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934560, task={node0}{hy7wWQAAQACJS0Cz_____w}: processing of [489][internal:cluster/coordination/commit_state]: ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: response to [485][internal:cluster/coordination/publish_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934560, task={node2}{MXR7YQAAQACXXRe5_____w}: [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
5: send commit requests
] advanceTime: from [5934460ms] to [5934474ms]
] advanceTime: from [5934474ms] to [5934500ms]
] advanceTime: from [5934500ms] to [5934560ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934560, task={node1}{MhCa3f__T_-UXusCAAAAAA}: [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934560, task={node0}{hy7wWQAAQACJS0Cz_____w}: processing of [489][internal:cluster/coordination/commit_state]: ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934560, task={node2}{MXR7YQAAQACXXRe5_____w}: [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
] running task 0 of 3: {node1}{MhCa3f__T_-UXusCAAAAAA}: [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}
{nodeId={node1}{MhCa3f__T_-UXusCAAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934660, task=DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]]
] running task 0 of 2: {node0}{hy7wWQAAQACJS0Cz_____w}: processing of [489][internal:cluster/coordination/commit_state]: ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934660, task={node0}{hy7wWQAAQACJS0Cz_____w}: delivery of response to [489][internal:cluster/coordination/commit_state]: Empty{}]
] running task 0 of 1: {node2}{MXR7YQAAQACXXRe5_____w}: [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}
{nodeId={node2}{MXR7YQAAQACXXRe5_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934660, task=DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]]
6: apply new state on remote nodes
] advanceTime: from [5934560ms] to [5934660ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934660, task=DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934660, task={node0}{hy7wWQAAQACJS0Cz_____w}: delivery of response to [489][internal:cluster/coordination/commit_state]: Empty{}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934660, task=DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]]
] running task 0 of 3: DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]
{nodeId={node1}{MhCa3f__T_-UXusCAAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934760, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] running task 1 of 2: DisruptableClusterApplierService[[ApplyCommitRequest{term=1, version=4, sourceNode={node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw}}]]
{nodeId={node2}{MXR7YQAAQACXXRe5_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934760, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: delivery of response to [489][internal:cluster/coordination/commit_state]: Empty{}
7: receive commit responses
] advanceTime: from [5934660ms] to [5934683ms]
] advanceTime: from [5934683ms] to [5934760ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934760, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934760, task={node0}{hy7wWQAAQACJS0Cz_____w}: response to [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}]
--> runFor(700ms) completed run until [5934760ms]: stabilising
--> runFor(700ms) running until [5935460ms]: draining task queue
] running task 0 of 2: {node0}{hy7wWQAAQACJS0Cz_____w}: response to [488][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node1}{MhCa3f__T_-UXusCAAAAAA}{0.0.0.0}{0.0.0.0:20}{cdfhilmrstw}
] running task 0 of 1: {node0}{hy7wWQAAQACJS0Cz_____w}: response to [490][internal:cluster/coordination/commit_state] from {node0}{hy7wWQAAQACJS0Cz_____w}{0.0.0.0}{0.0.0.0:21}{cdfhilmrstw} to {node2}{MXR7YQAAQACXXRe5_____w}{0.0.0.0}{0.0.0.0:18}{cdfhilmrstw}
{nodeId={node0}{hy7wWQAAQACJS0Cz_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=5934860, task=DisruptableClusterApplierService[[Publication{term=1, version=4}]]]
8: apply new state on master node
] advanceTime: from [5934760ms] to [5934766ms]
] advanceTime: from [5934766ms] to [5934783ms]
] advanceTime: from [5934783ms] to [5934860ms]
] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=5934860, task=DisruptableClusterApplierService[[Publication{term=1, version=4}]]]
] running task 0 of 1: DisruptableClusterApplierService[[Publication{term=1, version=4}]]
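Summing that excerpt up (delays were fixed at 100ms for this trace), the eight sequential scheduleNow hops account for the full 800ms from 5934060ms to 5934860ms; a back-of-the-envelope check:

long delayVariabilityMillis = 100;   // fixed delay used for this trace
int stages = 8;                      // stage 1 (master service task) through stage 8 (apply new state on master node)
long commitDurationMillis = stages * delayVariabilityMillis;   // 800ms = 5934860 - 5934060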
I don't think this was affected by #86921 because even the old MockSinglePrioritizingExecutor uses scheduleNow to queue up the task, so that PR didn't change the number of scheduleNow calls. Instead it looks like we added an extra scheduleNow call in #61455.
I am afraid this happened again, see https://gradle-enterprise.elastic.co/s/cwlcjoocvaehg/tests/:server:test/org.elasticsearch.cluster.coordination.CoordinatorTests/testSingleNodeDiscoveryStabilisesEvenWhenDisrupted?top-execution=1. I am reopening this issue; if this is not the right thing to do, let me know and I will create a new one.
Mary's issue is reproducible. Adding another delayVariabilityMillis in the runFor expression makes it pass. So, I delved once more into @DaveCTurner 's approach of running the testCanUpdateClusterStateAfterStabilisation test with fixed delays of 100ms in the org.elasticsearch.common.util.concurrent.DeterministicTaskQueue#scheduleNow and scheduleAt functions. And I believe we still have 8 delays/stages in main for the DEFAULT_CLUSTER_STATE_UPDATE_DELAY. Here is an excerpt:
[2022-10-31T16:47:55,184][INFO ][o.e.c.c.CoordinatorTests ] [testCanUpdateClusterStateAfterStabilisation] --> runFor(800ms) running until [2067930ms]: stabilising
[2022-10-31T16:47:55,184][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] running task 0 of 2: {node1}{lSqmpP__T_-qutLN_____w}: timeout handler for [190][internal:coordination/fault_detection/leader_check]
[2022-10-31T16:47:55,185][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] running task 0 of 1: {node0}{m4mHn___T_-dxn9_AAAAAA}: timeout handler for [204][internal:coordination/fault_detection/follower_check]
1: Master service task
[2022-10-31T16:47:55,185][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067130ms] to [2067230ms]
[2022-10-31T16:47:55,185][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067230, task={node0}{m4mHn___T_-dxn9_AAAAAA}: master service scheduling next task]
{m4mHn___T_-dxn9_AAAAAA}}] register 8 [master] [publish_cluster_state_update] [publication of cluster state [12]]
[2022-10-31T16:47:55,187][TRACE][o.e.c.s.MasterService ] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] cluster state updated, source [new value [0=3964185402078690287]]
DeferredTask[executionTimeMillis=2067330, task={node0}{m4mHn___T_-dxn9_AAAAAA}: publish change of cluster state from version [11] in term [8] to version [12] in term [8]]
2: state publisher task
[2022-10-31T16:47:55,188][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067230ms] to [2067330ms]
[2022-10-31T16:47:55,188][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067330, task={node0}{m4mHn___T_-dxn9_AAAAAA}: publish change of cluster state from version [11] in term [8] to version [12] in term [8]]
[2022-10-31T16:47:55,206][TRACE][o.e.c.c.LeaderChecker ] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] setCurrentNodes: [{node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}, {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}]
[2022-10-31T16:47:55,207][TRACE][o.e.c.c.C.CoordinatorPublication] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] publishing PublishRequest{term=8, version=12, state=cluster uuid: tYpV1P__T_-u0UMBAAAAAA [committed: true]
[2022-10-31T16:47:55,208][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=2067430, task={node0}{m4mHn___T_-dxn9_AAAAAA}: ActionRunnable#wrap[handling cluster state version [12] locally on [{node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}]]]
3: deliver publish requests
[2022-10-31T16:47:55,209][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067330ms] to [2067430ms]
[2022-10-31T16:47:55,209][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067430, task={node1}{lSqmpP__T_-qutLN_____w}: [212][internal:cluster/coordination/publish_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
[2022-10-31T16:47:55,209][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067430, task={node0}{m4mHn___T_-dxn9_AAAAAA}: ActionRunnable#wrap[handling cluster state version [12] locally on [{node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}]]]
[2022-10-31T16:47:55,219][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=2067530, task={node0}{m4mHn___T_-dxn9_AAAAAA}: processing of [214][internal:cluster/coordination/commit_state]: ApplyCommitRequest{term=8, version=12, sourceNode={node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}}]
4: receive publish responses
[2022-10-31T16:47:55,220][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067430ms] to [2067530ms]
[2022-10-31T16:47:55,220][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067530, task={node0}{m4mHn___T_-dxn9_AAAAAA}: response to [212][internal:cluster/coordination/publish_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
[2022-10-31T16:47:55,223][TRACE][o.e.t.TaskManager ] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] unregister task for id: 213
[2022-10-31T16:47:55,223][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=2067630, task={node1}{lSqmpP__T_-qutLN_____w}: response to [198][internal:coordination/fault_detection/leader_check] from {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw} to {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}]
5: send commit requests
[2022-10-31T16:47:55,223][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067530ms] to [2067630ms]
[2022-10-31T16:47:55,223][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067630, task={node0}{m4mHn___T_-dxn9_AAAAAA}: delivery of response to [214][internal:cluster/coordination/commit_state]: Empty{}]
[2022-10-31T16:47:55,223][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067630, task={node0}{m4mHn___T_-dxn9_AAAAAA}: response to [213][internal:coordination/fault_detection/follower_check] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
[2022-10-31T16:47:55,224][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067630, task={node1}{lSqmpP__T_-qutLN_____w}: [215][internal:cluster/coordination/commit_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
[2022-10-31T16:47:55,225][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=2068730, task={node0}{m4mHn___T_-dxn9_AAAAAA}: FollowerChecker{discoveryNode={node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}, failureCountSinceLastSuccess=0, timeoutCountSinceLastSuccess=0, [cluster.fault_detection.follower_check.retry_count]=3}::handleWakeUp] with extra delay of [100ms]
6: apply new state on remote nodes
[2022-10-31T16:47:55,226][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067630ms] to [2067730ms]
[2022-10-31T16:47:55,226][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067730, task=DisruptableClusterApplierService[[ApplyCommitRequest{term=8, version=12, sourceNode={node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}}]]]
[2022-10-31T16:47:55,226][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] running task 0 of 1: DisruptableClusterApplierService[[ApplyCommitRequest{term=8, version=12, sourceNode={node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw}}]]
[2022-10-31T16:47:55,228][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node1}{lSqmpP__T_-qutLN_____w}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=2067830, task={node0}{m4mHn___T_-dxn9_AAAAAA}: response to [215][internal:cluster/coordination/commit_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
7: receive commit responses
[2022-10-31T16:47:55,228][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067730ms] to [2067830ms]
[2022-10-31T16:47:55,228][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067830, task={node0}{m4mHn___T_-dxn9_AAAAAA}: response to [215][internal:cluster/coordination/commit_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}]
[2022-10-31T16:47:55,228][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] running task 0 of 1: {node0}{m4mHn___T_-dxn9_AAAAAA}: response to [215][internal:cluster/coordination/commit_state] from {node0}{m4mHn___T_-dxn9_AAAAAA}{0.0.0.0}{0.0.0.0:6}{IScdfhilmrstw} to {node1}{lSqmpP__T_-qutLN_____w}{0.0.0.0}{0.0.0.0:3}{IScdfhilmrstw}
[2022-10-31T16:47:55,229][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] scheduleNow: delaying [100ms], scheduling DeferredTask[executionTimeMillis=2067930, task=DisruptableClusterApplierService[[Publication{term=8, version=12}]]]
[2022-10-31T16:47:55,229][TRACE][o.e.c.c.C.CoordinatorPublication] [testCanUpdateClusterStateAfterStabilisation{nodeId={node0}{m4mHn___T_-dxn9_AAAAAA}}] onPossibleCompletion: [Publication{term=8, version=12}] was successful
8: apply new state on master node
[2022-10-31T16:47:55,229][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: from [2067830ms] to [2067930ms]
[2022-10-31T16:47:55,229][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testCanUpdateClusterStateAfterStabilisation] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=2067930, task=DisruptableClusterApplierService[[Publication{term=8, version=12}]]]
[2022-10-31T16:47:55,229][INFO ][o.e.c.c.CoordinatorTests ] [testCanUpdateClusterStateAfterStabilisation] --> runFor(800ms) completed run until [2067930ms]: stabilising
So, I believe this is now a different problem. I tried to compare the outputs between the failed run and the successful one (where I added delayVariabilityMillis in the runFor expression).
Here is the failed one:
[2022-10-31T20:55:56,985][INFO ][o.e.c.c.CoordinatorTests ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] --> runFor(3140601ms) running until [3140601ms]: stabilising
[2022-10-31T20:55:56,988][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [0ms] to [91ms]
...
[2022-10-31T20:56:01,076][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3073086ms] to [3080013ms]
[2022-10-31T20:56:01,076][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3080013, task={node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed]
[2022-10-31T20:56:01,076][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed
[2022-10-31T20:56:01,076][WARN ][o.e.c.c.JoinHelper ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] last failed join attempt was 61ms ago, failed to join {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw} with JoinRequest{sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, minimumTerm=13, optionalJoin=Optional[Join{term=14, lastAcceptedTerm=7, lastAcceptedVersion=1, sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, targetNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}}]}
org.elasticsearch.transport.RemoteTransportException: [][0.0.0.0:1][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 14 does not match current term 15
at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:218) ~[main/:?]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:1342) ~[main/:?]
at java.util.Optional.ifPresent(Optional.java:178) ~[?:?]
at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:753) ~[main/:?]
at org.elasticsearch.cluster.coordination.Coordinator$1.lambda$onResponse$1(Coordinator.java:612) ~[main/:?]
...
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) ~[randomizedtesting-runner-2.8.0.jar:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]
[2022-10-31T20:56:01,087][WARN ][o.e.c.c.ClusterFormationFailureHelper] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] master not discovered or elected yet, an election requires a node with id [node0], have discovered possible quorum [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]; discovery will continue using [] from hosts providers and [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] from last-known cluster state; node term 15, last-accepted version 1 in term 7; joining [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] in term [15] has status [waiting for response] after [3.4m/208138ms]
[2022-10-31T20:56:01,088][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=3113554, task={node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed] with extra delay of [23541ms]
[2022-10-31T20:56:01,088][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3080013ms] to [3105607ms]
[2022-10-31T20:56:01,088][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3105607, task={node0}{t71ERgAAQACqOqV0AAAAAA}: PeerFinder handling wakeup]
[2022-10-31T20:56:01,089][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: PeerFinder handling wakeup
[2022-10-31T20:56:01,091][TRACE][o.e.d.PeerFinder ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] probing master nodes from cluster state: [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]
[2022-10-31T20:56:01,092][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=3321985, task={node0}{t71ERgAAQACqOqV0AAAAAA}: PeerFinder handling wakeup] with extra delay of [215378ms]
[2022-10-31T20:56:01,093][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3105607ms] to [3113554ms]
[2022-10-31T20:56:01,093][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3113554, task={node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed]
[2022-10-31T20:56:01,093][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed
[2022-10-31T20:56:01,095][WARN ][o.e.c.c.ClusterFormationFailureHelper] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] master not discovered or elected yet, an election requires a node with id [node0], have discovered possible quorum [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]; discovery will continue using [] from hosts providers and [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] from last-known cluster state; node term 15, last-accepted version 1 in term 7; joining [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] in term [15] has status [waiting for response] after [4m/241679ms]
[2022-10-31T20:56:01,095][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=3269433, task={node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed] with extra delay of [145879ms]
[2022-10-31T20:56:01,095][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3113554ms] to [3135001ms]
[2022-10-31T20:56:01,096][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3135001, task={node0}{t71ERgAAQACqOqV0AAAAAA}: timeout handler for [91][internal:cluster/formation/info]]
[2022-10-31T20:56:01,096][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: timeout handler for [91][internal:cluster/formation/info]
[2022-10-31T20:56:01,096][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3135001ms] to [3150655ms]
[2022-10-31T20:56:01,097][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3150655, task={node0}{t71ERgAAQACqOqV0AAAAAA}: timeout handler for [92][internal:cluster/formation/info]]
[2022-10-31T20:56:01,097][INFO ][o.e.c.c.CoordinatorTests ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] --> runFor(3140601ms) completed run until [3140601ms]: stabilising
And here's the excerpt from the successful one (which simply runs longer):
[2022-10-31T21:16:40,865][INFO ][o.e.c.c.CoordinatorTests ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] --> runFor(3382178ms) running until [3382178ms]: stabilising
[2022-10-31T21:16:40,866][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [0ms] to [91ms]
...
[2022-10-31T21:16:41,804][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3073086ms] to [3080013ms]
[2022-10-31T21:16:41,804][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3080013, task={node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed]
[2022-10-31T21:16:41,804][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: emit warning if cluster not formed
[2022-10-31T21:16:41,804][WARN ][o.e.c.c.JoinHelper ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] last failed join attempt was 13ms ago, failed to join {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw} with JoinRequest{sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, minimumTerm=13, optionalJoin=Optional[Join{term=14, lastAcceptedTerm=7, lastAcceptedVersion=1, sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, targetNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}}]}
org.elasticsearch.transport.RemoteTransportException: [][0.0.0.0:1][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 14 does not match current term 15
at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:218) ~[main/:?]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:1342) ~[main/:?]
at java.util.Optional.ifPresent(Optional.java:178) ~[?:?]
at org.elasticsearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:753) ~[main/:?]
...
[2022-10-31T21:16:41,852][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3306117ms] to [3306416ms]
[2022-10-31T21:16:41,852][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3306416, task={node0}{t71ERgAAQACqOqV0AAAAAA}: delivery of response to [95][internal:cluster/coordination/join/ping]: Empty{}]
[2022-10-31T21:16:41,852][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: delivery of response to [95][internal:cluster/coordination/join/ping]: Empty{}
[2022-10-31T21:16:41,852][DEBUG][o.e.c.c.CoordinationState] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] handleJoin: ignored join due to term mismatch (expected: [18], actual: [15])
[2022-10-31T21:16:41,852][DEBUG][o.e.c.c.Coordinator ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] failed to add Join{term=15, lastAcceptedTerm=7, lastAcceptedVersion=1, sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, targetNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}} - ignoring
org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 15 does not match current term 18
at org.elasticsearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:218) ~[main/:?]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoinIgnoringExceptions(Coordinator.java:1352) ~[main/:?]
at org.elasticsearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:1332) ~[main/:?]
at java.util.Optional.ifPresent(Optional.java:178) ~[?:?]
...
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850) ~[randomizedtesting-runner-2.8.0.jar:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]
[2022-10-31T21:16:41,855][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3306416ms] to [3321985ms]
[2022-10-31T21:16:41,856][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3321985, task={node0}{t71ERgAAQACqOqV0AAAAAA}: PeerFinder handling wakeup]
[2022-10-31T21:16:41,856][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: PeerFinder handling wakeup
[2022-10-31T21:16:41,856][TRACE][o.e.d.PeerFinder ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] not active
[2022-10-31T21:16:41,856][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3321985ms] to [3325256ms]
[2022-10-31T21:16:41,856][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3325256, task={node0}{t71ERgAAQACqOqV0AAAAAA}: async initialization of indices lookup for cluster state 2]
[2022-10-31T21:16:41,856][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: async initialization of indices lookup for cluster state 2
[2022-10-31T21:16:41,857][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3325256ms] to [3342304ms]
[2022-10-31T21:16:41,857][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3342304, task=DisruptableClusterApplierService[[joining {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]]]
[2022-10-31T21:16:41,857][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: DisruptableClusterApplierService[[joining {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]]
[2022-10-31T21:16:41,857][DEBUG][o.e.c.s.ClusterApplierService] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] processing [joining {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]: execute
[2022-10-31T21:16:41,857][DEBUG][o.e.c.s.ClusterApplierService] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] processing [joining {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]: took [0s] no change in cluster state
[2022-10-31T21:16:41,857][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [109][internal:cluster/coordination/join] sent to [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] (timeout: [null])
[2022-10-31T21:16:41,857][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [109][internal:cluster/coordination/join] received request
[2022-10-31T21:16:41,857][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleNow: delaying [205799ms], scheduling DeferredTask[executionTimeMillis=3548103, task={node0}{t71ERgAAQACqOqV0AAAAAA}: processing of [109][internal:cluster/coordination/join]: JoinRequest{sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, minimumTerm=15, optionalJoin=Optional[Join{term=16, lastAcceptedTerm=7, lastAcceptedVersion=1, sourceNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, targetNode={node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}}]}]
[2022-10-31T21:16:41,858][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3342304ms] to [3359474ms]
[2022-10-31T21:16:41,858][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3359474, task={node0}{t71ERgAAQACqOqV0AAAAAA}: org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService$$Lambda$1189/0x000000080148e7a8@78688d8a]
[2022-10-31T21:16:41,858][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] running task 0 of 1: {node0}{t71ERgAAQACqOqV0AAAAAA}: org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService$$Lambda$1189/0x000000080148e7a8@78688d8a
[2022-10-31T21:16:41,859][TRACE][o.e.c.c.CoordinationDiagnosticsService] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] Opened connection to {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}, making transport request
[2022-10-31T21:16:41,859][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=3440500, task={node0}{t71ERgAAQACqOqV0AAAAAA}: timeout handler for [110][internal:cluster/formation/info]] with extra delay of [71026ms]
[2022-10-31T21:16:41,859][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [110][internal:cluster/formation/info] sent to [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}] (timeout: [10s])
[2022-10-31T21:16:41,859][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [110][internal:cluster/formation/info] received request
[2022-10-31T21:16:41,859][TRACE][o.e.t.TaskManager ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] register 109 [direct] [internal:cluster/formation/info] []
[2022-10-31T21:16:41,859][TRACE][o.e.t.TaskManager ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] unregister task for id: 109
[2022-10-31T21:16:41,859][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [110][internal:cluster/formation/info] sent response
[2022-10-31T21:16:41,859][TRACE][o.e.t.T.tracer ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] [110][internal:cluster/formation/info] received response from [{node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw}]
[2022-10-31T21:16:41,860][TRACE][o.e.c.c.CoordinationDiagnosticsService] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] Received remote response from {node0}{t71ERgAAQACqOqV0AAAAAA}{0.0.0.0}{0.0.0.0:1}{IScdfhilmrstw} in 44.3ms
[2022-10-31T21:16:41,860][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted{nodeId={node0}{t71ERgAAQACqOqV0AAAAAA}}] scheduleAt: adding DeferredTask[executionTimeMillis=3587329, task={node0}{t71ERgAAQACqOqV0AAAAAA}: org.elasticsearch.cluster.coordination.CoordinationDiagnosticsService$$Lambda$1189/0x000000080148e7a8@72067920] with extra delay of [217855ms]
[2022-10-31T21:16:41,860][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: from [3359474ms] to [3402394ms]
[2022-10-31T21:16:41,860][TRACE][o.e.c.u.c.DeterministicTaskQueue] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] advanceTime: no longer deferred: DeferredTask[executionTimeMillis=3402394, task={node0}{t71ERgAAQACqOqV0AAAAAA}: publish change of cluster state from version [1] in term [7] to version [2] in term [18]]
[2022-10-31T21:16:41,860][INFO ][o.e.c.c.CoordinatorTests ] [testSingleNodeDiscoveryStabilisesEvenWhenDisrupted] --> runFor(3382178ms) completed run until [3382178ms]: stabilising
From my rather inexperienced understanding, and looking at the above comparisons, I believe that the specific seed makes the test cluster unstable for much longer than the test's runFor() formula expects. I say this because we see Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 14 does not match current term 15 as the last thing in the failed test, but we also see it in the middle of the successful one (at exactly the same spot of advanceTime: from [3073086ms] to [3080013ms], which hints that the tests have done the same steps up to that point), which then goes on with more failing terms until it finally succeeds.
I hope @DaveCTurner may be able to shed more light on how to continue on this one.
This seems to be a genuine (but very low-impact) bug. In the failing run the delay variability is set to a massive 241577ms (i.e. just over 4 minutes). The ElectionScheduler is triggering elections much more frequently than this, so there are lots of concurrent elections in flight. Many of these elections pass the request_pre_vote stage and increment the term, which causes earlier elections to fail, but they are then superseded by a later election before they can complete. It's something of a coincidence that one more delayVariabilityMillis is enough to complete an election. It could take many more rounds in theory.
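To give a rough sense of scale (the election retry interval below is an assumption about the defaults, not a value taken from the logs):

// message/scheduling delays are inflated by up to ~241577ms while new election rounds
// keep being scheduled on the order of seconds, so dozens of rounds can be in flight at once
long delayVariabilityMillis = 241_577;        // from the failing run
long assumedElectionRetryMillis = 10_000;     // assumed order of cluster.election.max_timeout
long roundsInFlight = delayVariabilityMillis / assumedElectionRetryMillis;   // ~24
// each round that passes pre-voting bumps the term, failing the slower, earlier rounds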
However when using single-node discovery there should be no need to schedule multiple elections like this because there's only one node involved so the first election should succeed. Maybe something like this is enough?
diff --git a/server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java b/server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java
index 33e3528246d..7d9b0b666e3 100644
--- a/server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java
+++ b/server/src/main/java/org/elasticsearch/cluster/coordination/Coordinator.java
@@ -1628,6 +1628,10 @@ public class Coordinator extends AbstractLifecycleComponent implements ClusterSt
}
if (prevotingRound != null) {
+ if (singleNodeDiscovery) {
+ logger.debug("skip prevoting as single-node discovery is enabled and prevoting is already in progress");
+ return;
+ }
prevotingRound.close();
}
prevotingRound = preVoteCollector.start(lastAcceptedState, getDiscoveredNodes());
@DaveCTurner wow, nice, thanks for taking a look and understanding it so quickly! I just tried that and it works. Will open a PR. I'm a bit confused by the debug message skip prevoting as single-node discovery is enabled and prevoting is already in progress, because it says to skip prevoting while prevoting is already in progress. Will take it as is, but feel free to tell me if it could be better formulated.
I didn't put much thought into the message, feel free to adjust as you see fit :)
I had one of those failures on my feature branch: https://gradle-enterprise.elastic.co/s/lcvafnq7rt4jm/tests/:server:test/org.elasticsearch.cluster.coordination.CoordinatorTests/testSingleNodeDiscoveryStabilisesEvenWhenDisrupted?top-execution=1
I believe I had this fix https://github.com/elastic/elasticsearch/pull/91255 when the failure occurred.
Thanks Ievgen, this looks to be a problem caused by #91255.
This ticket is haunted. You fix one thing, another thing pops up :) @dct , any reason you believe the PR caused the new failure? I will try to reproduce it now.
Yes, the problem in this failure case is that an election happens within runRandomly but fails due to a (simulated) IO exception; we don't have a way to remove prevotingRound when this happens, so the change we made in #91255 just blocks all future election attempts.
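A minimal sketch of that interaction, based on the guard shown in the earlier diff (the surrounding method is abridged):

if (prevotingRound != null) {
    if (singleNodeDiscovery) {
        // a prevotingRound left over from an election that failed with a simulated IO
        // exception is never closed or cleared, so this early return now blocks every
        // subsequent election attempt
        return;
    }
    prevotingRound.close();
}
prevotingRound = preVoteCollector.start(lastAcceptedState, getDiscoveredNodes());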
I cannot reproduce the new failure on main (I ran 10 iterations of it). @DaveCTurner , how did you understand all that? :) I cannot see any logs in the test failure.
we don't have a way to remove prevotingRound when this happens
Is this something that can happen in reality and thus we introduced a new bug?
I cannot reproduce the new failure on main
The REPRODUCE WITH line works on the feature/desired-balance-allocator branch (but not on main).
Is this something that can happen in reality and thus we introduced a new bug?
Yes I think so.
I think we should revert #91255 because the failure it introduces seems much more frequent than the one it fixes (a few hundred iterations vs a few thousand)
BTW, I also reproduced it in main. Did not take many iterations until it popped up.
I was thinking about how to solve both issues. I think either:
- Reverting https://github.com/elastic/elasticsearch/pull/91255 and maybe configuring the elections to not happen so frequently (e.g., adjusting ELECTION_INITIAL_TIMEOUT_SETTING for this test to be equal to the delay variability).
- Or keeping https://github.com/elastic/elasticsearch/pull/91255 and adding some special handling for single-node discovery to re-kick prevoting if there was a failure.
@DaveCTurner what do you think? Guessing from your last comment, would the first approach be worth trying?
I think the second approach would be more robust, but also much more effort because the plumbing for that failure handling just doesn't exist today. And this is a super-minor bug that almost certainly eventually goes away with enough retries so not really worth the effort.
Therefore I favour the first. I've just pushed bdf34852576159b3919b94979559bee5444f425c which reverts #91255. I'll also push a commit which mutes this test while we continue to work on it.
I think the best setting to adjust would be cluster.election.duration, which represents a kind of election timeout after which we start to think about scheduling the next election. Today that cannot be set greater than 5 minutes, but with the delay variability in this test sometimes going up to 10 minutes we'd need a much longer max. I think that's ok.
I can test this.
I think that's ok.
You mean it's OK to set the delay variability to up to 5 mins instead of 10mins?
No, I mean it should be ok to relax the upper bound on cluster.election.duration. The election takes multiple steps so it needs to be much longer than delayVariabilityMillis - I reckon something like delayVariabilityMillis + 4*delayVariabilityMillis + clusterStateUpdateDelay, which adds up to 2h10min.
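A worked check of that figure, assuming the maximum delay variability of 10 minutes and the 8-stage cluster state update delay observed earlier in this thread:

long delayVariabilityMillis = 10 * 60 * 1000L;               // 10 minutes
long clusterStateUpdateDelay = 8 * delayVariabilityMillis;   // 80 minutes (8 stages)
long electionDurationNeeded = delayVariabilityMillis         // 10 minutes
    + 4 * delayVariabilityMillis                              // 40 minutes
    + clusterStateUpdateDelay;                                // 80 minutes
// total: 130 minutes = 2h10min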
@DaveCTurner , so should I change the max of cluster.election.duration to 10 minutes instead of 5 minutes in production? Or is there a way I can change the max just for the test?
By the way, I notice that cluster.election.max_timeout also has 5 mins as maximum. Is that related to the election duration max value? If yes, should I change the max_timeout's max value to also be 10 mins?
FYI, setting cluster.election.duration to delayVariabilityMillis when the latter is bigger than the default duration worked for one example I had. I will now re-run a lot of iterations to make sure it is finally stable.
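For what it's worth, the adjustment in the test setup looks roughly like this (a sketch; the exact setting constant and where it is applied are assumptions on my side):

// only raise the election duration when the randomised delay variability exceeds the default
long electionDurationMillis = Math.max(
    ELECTION_DURATION_SETTING.get(Settings.EMPTY).millis(),
    delayVariabilityMillis
);
Settings nodeSettings = Settings.builder()
    .put(ELECTION_DURATION_SETTING.getKey(), TimeValue.timeValueMillis(electionDurationMillis))
    .build();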
OK, after many iterations (>1000) it looks good. I will open the PR.