[BUG] org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testRestartPrimary_NoReplicas is flaky

Open • reta opened this issue 1 year ago • 1 comment

Describe the bug

The test case org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testRestartPrimary_NoReplicas is flaky:

Jan 23, 2024 10:50:42 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNUNG: Uncaught exception in thread: Thread[#1383,opensearch[node_t2][remote_refresh_retry][T#1],5,TGRP-SegmentReplicationUsingRemoteStoreIT]
org.opensearch.core.concurrency.OpenSearchRejectedExecutionException: rejected execution of java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@582274dc[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@69017baa[Wrapped task = [threaded] org.opensearch.index.shard.ReleasableRetryableRefreshListener$$Lambda/0x00007f2e10b968f0@13593fdf]] on org.opensearch.threadpool.Scheduler$SafeScheduledThreadPoolExecutor@70886659[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 28]
	at __randomizedtesting.SeedInfo.seed([59421F2A5DE382B]:0)
	at org.opensearch.common.util.concurrent.OpenSearchAbortPolicy.rejectedExecution(OpenSearchAbortPolicy.java:67)
	at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562)
	at org.opensearch.threadpool.ThreadPool.schedule(ThreadPool.java:488)
	at org.opensearch.index.shard.ReleasableRetryableRefreshListener.scheduleRetry(ReleasableRetryableRefreshListener.java:125)
	at org.opensearch.index.shard.ReleasableRetryableRefreshListener.scheduleRetry(ReleasableRetryableRefreshListener.java:178)
	at org.opensearch.index.shard.ReleasableRetryableRefreshListener.runAfterRefreshWithPermit(ReleasableRetryableRefreshListener.java:167)
	at org.opensearch.index.shard.ReleasableRetryableRefreshListener.lambda$scheduleRetry$2(ReleasableRetryableRefreshListener.java:126)
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:854)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
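
Reading the trace, the rejection is raised in ScheduledThreadPoolExecutor.delayedExecute while ReleasableRetryableRefreshListener.scheduleRetry is submitting a retry to a scheduler that is already reported as Terminated. The following is a minimal sketch of that general pattern using only java.util.concurrent; it is not the OpenSearch code, and the class and method names (RetryAfterShutdownSketch, retryIsRejectedAfterShutdown) are hypothetical, chosen for illustration only.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not OpenSearch code: scheduling a task on a
// scheduler that has already been shut down is rejected by the default policy.
public class RetryAfterShutdownSketch {

    /** Returns true if scheduling a retry on an already shut down scheduler is rejected. */
    static boolean retryIsRejectedAfterShutdown() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        // Stands in for the node's scheduler being stopped (assumption: e.g. during a restart).
        scheduler.shutdownNow();
        try {
            // Stands in for a late retry being scheduled after shutdown.
            scheduler.schedule(() -> {}, 10, TimeUnit.MILLISECONDS);
            return false;
        } catch (RejectedExecutionException e) {
            // The default AbortPolicy throws once the executor is shut down.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + retryIsRejectedAfterShutdown());
    }
}
```

If the flakiness does come from this kind of race, the outcome would depend on whether the retry is scheduled before or after the scheduler stops, which would be consistent with the test failing only intermittently.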

Related component

Other

To Reproduce

./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testRestartPrimary_NoReplicas" -Dtests.seed=59421F2A5DE382B

Expected behavior

The test must always pass

Additional Details

Plugins: Standard

Host/Environment:

  • CI

Additional context

  • https://build.ci.opensearch.org/job/gradle-check/32505/testReport/junit/org.opensearch.remotestore/SegmentReplicationUsingRemoteStoreIT/testRestartPrimary_NoReplicas/

reta, Jan 23 '24 16:01

[Triage - attendees 1 2 3] @reta Thanks for filing

peternied, Jan 24 '24 16:01