
[AUTOCUT] Gradle Check Flaky Test Report for RemoteSplitIndexIT

Open opensearch-ci-bot opened this issue 1 year ago • 1 comments

Flaky Test Report for RemoteSplitIndexIT

RemoteSplitIndexIT has flaky tests that failed during post-merge actions.

Details

Git Reference | Merged Pull Request | Build Details | Test Name
1386a9b902c4af0e3cb88a6e7e16861970415b76 | 13930 | 39885 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod
392e666e0e13dfc16923e8476e2c42f19a82c818 | 14487 | 41458 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testSplitIndexPrimaryTerm
563375de28b16870ab42b9fb4260127598d47d91 | 14187 | 41622 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod
5ad0f5dc1303bf63c973cd93987077d9748ab167 | 14203 | 40762 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
5b1945419dc8da8b1ce1ce46cb1f163e61b08018 | 13801 | 39904 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
784f7d3264eb7efdb3c4597d2e72650ddbd5d39f | 14214 | 40795 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod
913013bd5c6b43d8337a97a7753bc2f10f36eae4 | 13948 | 39666 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod
a6c86e7774df984f45b506830c9bf581746e92de | 13906 | 39544 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
b9ca5a8e24673ed38cab736ffbd57479de241553 | 14027 | 40019 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.classMethod
bc39354db3923b2aedd13c15d92f51d2038b3489 | 14124 | 40730 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
c38dfef55610aa5a1f713b22e54e2f282f327ef0 | 14195 | 40750 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
d2757f74e50b66b7cda494f6d06936d6d32bb2c1 | 14250 | 40921 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
d56d8c88e07ae416d41197b05103ea2dba393967 | 14489 | 41572 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex
f2fd8047e1669395742657760ea35c11f50368e0 | 14166 | 40663 | org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex

The other pull requests, besides those involved in post-merge actions, that contain failing tests with the RemoteSplitIndexIT class are:

For more details on the failed tests, refer to the OpenSearch Gradle Check Metrics dashboard.

opensearch-ci-bot avatar Jun 13 '24 21:06 opensearch-ci-bot

The "classMethod" failure appears to be due to a file leak:

RemoteSplitIndexIT > classMethod FAILED
    java.lang.RuntimeException: file handle leaks: [InputStream(/var/jenkins/workspace/gradle-check/search/server/build/testrun/internalClusterTest/temp/org.opensearch.action.admin.indices.create.RemoteSplitIndexIT_3726536651049A1A-001/tempDir-002/repos/cjWLgWIQEY/T11011111101111/6z6VohLmTKe07eW5Z98zQA/1/translog/data/2/translog-11.tlog)]
        at __randomizedtesting.SeedInfo.seed([3726536651049A1A]:0)
        at org.apache.lucene.tests.mockfile.LeakFS.onClose(LeakFS.java:63)
        at org.apache.lucene.tests.mockfile.FilterFileSystem.close(FilterFileSystem.java:69)
        at org.apache.lucene.tests.mockfile.FilterFileSystem.close(FilterFileSystem.java:70)
        at org.apache.lucene.tests.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:223)
        at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
        at java.base/java.lang.Thread.run(Thread.java:1583)
./gradlew ':server:internalClusterTest' --tests "org.opensearch.action.admin.indices.create.RemoteSplitIndexIT" -Dtests.seed=3726536651049A1A -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Druntime.java=21
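
For readers less familiar with this kind of failure: the leak check in the trace above reports an InputStream on a translog file that was still open when the test file system was closed. The sketch below is only a generic illustration of reading a file so that the stream is always closed. The class and method names are hypothetical, and this is not the OpenSearch code path or the eventual fix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-alone sketch; not taken from OpenSearch.
class TranslogReadSketch {

    // Reads the whole file. try-with-resources closes the stream even if
    // readAllBytes() throws, so no handle is left open for a leak check to find.
    static byte[] readAndClose(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("translog-sketch", ".tlog");
        Files.write(tmp, "ops".getBytes(StandardCharsets.UTF_8));
        byte[] data = readAndClose(tmp);
        // The stream is already closed here, so the delete cannot race with an open handle.
        Files.delete(tmp);
        System.out.println("read " + data.length + " bytes");
    }
}
```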

andrross avatar Jun 17 '24 15:06 andrross

[Catch All Triage - 1, 2, 3, 4, 5]

dblock avatar Sep 09 '24 16:09 dblock

Hi team, we are also seeing the same issue on ppc64le.

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex" -Dtests.seed=71FF58527F2E975B -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=fr-CH -Dtests.timezone=Asia/Saigon -Druntime.java=21

RemoteSplitIndexIT > testCreateSplitIndex FAILED
    UncategorizedExecutionException[Failed execution]; nested: IOException[Failed to upload 2 files during transfer];
        at __randomizedtesting.SeedInfo.seed([71FF58527F2E975B:9BF8EE873AE5D59]:0)
        at app//org.opensearch.action.support.AdapterActionFuture.unwrapEsException(AdapterActionFuture.java:102)
        at app//org.opensearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:57)
        at app//org.opensearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:73)
        at app//org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.testCreateSplitIndex(RemoteSplitIndexIT.java:414)

    Caused by:
    java.io.IOException: Failed to upload 2 files during transfer
        at org.opensearch.index.translog.transfer.TranslogTransferManager.transferSnapshot(TranslogTransferManager.java:199)
        at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:426)
        at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:409)
        at org.opensearch.index.translog.RemoteFsTranslog.ensureSynced(RemoteFsTranslog.java:341)
        at org.opensearch.index.translog.Translog.ensureSynced(Translog.java:837)
        at org.opensearch.index.translog.InternalTranslogManager.ensureTranslogSynced(InternalTranslogManager.java:184)
        at org.opensearch.index.engine.InternalEngine.ensureTranslogSynced(InternalEngine.java:605)
        at org.opensearch.index.shard.IndexShard.lambda$createTranslogSyncProcessor$44(IndexShard.java:4441)
        at org.opensearch.index.shard.IndexShard$6.write(IndexShard.java:4455)

prachi-gaonkar avatar Oct 15 '24 08:10 prachi-gaonkar

Hi team, is there any update on this issue?

prachi-gaonkar avatar Oct 21 '24 05:10 prachi-gaonkar

While running locally, I am getting a failure in 500 runs:

[2025-03-03T19:09:56,894][ERROR][o.o.i.t.t.TranslogTransferManager] [node_t0] [target][1] Exception occurred while cleaning translog at path=[W01011001000101][kJ9uOs4eSG2EhNaSFiwaVA][1][translog][data]
java.io.IOException: access denied: /local/home/gbbafna/git/OpenSearch/server/build/testrun/internalClusterTest/temp/org.opensearch.action.admin.indices.create.RemoteSplitIndexIT_CC16184ED99D441C-001/tempDir-002/repos/tbnAiwbhZD/W01011001000101/kJ9uOs4eSG2EhNaSFiwaVA/1/translog/data/2/translog-45.tlog
	at org.apache.lucene.tests.mockfile.WindowsFS.checkDeleteAccess(WindowsFS.java:117) ~[lucene-test-framework-10.1.0.jar:10.1.0 884954006de769dc43b811267230d625886e6515 - 2024-12-17 16:15:44]
	at org.apache.lucene.tests.mockfile.WindowsFS.delete(WindowsFS.java:126) ~[lucene-test-framework-10.1.0.jar:10.1.0 884954006de769dc43b811267230d625886e6515 - 2024-12-17 16:15:44]
	at java.base/java.nio.file.Files.delete(Files.java:1153) ~[?:?]
	at org.opensearch.common.blobstore.fs.FsBlobContainer$1.visitFile(FsBlobContainer.java:147) ~[main/:?]
	at org.opensearch.common.blobstore.fs.FsBlobContainer$1.visitFile(FsBlobContainer.java:137) ~[main/:?]
	at java.base/java.nio.file.Files.walkFileTree(Files.java:2810) ~[?:?]
	at java.base/java.nio.file.Files.walkFileTree(Files.java:2881) ~[?:?]
	at org.opensearch.common.blobstore.fs.FsBlobContainer.delete(FsBlobContainer.java:137) ~[main/:?]
	at org.opensearch.index.translog.transfer.BlobStoreTransferService.delete(BlobStoreTransferService.java:287) ~[main/:?]
	at org.opensearch.index.translog.transfer.BlobStoreTransferService.lambda$deleteAsync$10(BlobStoreTransferService.java:294) [main/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:935) [main/:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:1575) [?:?]

The "access denied" error is odd: if it were a genuine permissions problem, it should occur consistently.
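
One possible reading, offered as an assumption rather than a diagnosis: the trace goes through Lucene's test-framework WindowsFS mock, which refuses to delete a file while a handle to it is still open. If so, the error would depend on whether a reader still holds the translog file when the asynchronous delete runs, which could explain why it is intermittent. The sketch below is a hypothetical in-memory stand-in for that behavior, not the Lucene class itself.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a mock file system that refuses to delete open files.
class OpenFileGuardSketch {
    // Number of open handles per path; a path with no open handles has no entry.
    private final Map<String, Integer> openHandles = new HashMap<>();

    void open(String path) {
        openHandles.merge(path, 1, Integer::sum);
    }

    void close(String path) {
        // Returning null from the remapping function removes the entry.
        openHandles.computeIfPresent(path, (p, n) -> n > 1 ? n - 1 : null);
    }

    void delete(String path) throws IOException {
        if (openHandles.containsKey(path)) {
            throw new IOException("access denied: " + path);
        }
        // Nothing to remove in this sketch; a real file system would unlink the file here.
    }

    public static void main(String[] args) throws IOException {
        OpenFileGuardSketch fs = new OpenFileGuardSketch();
        fs.open("translog-45.tlog");
        try {
            fs.delete("translog-45.tlog"); // a reader is still open, so this is rejected
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        fs.close("translog-45.tlog");
        fs.delete("translog-45.tlog"); // succeeds once the handle is closed
    }
}
```

In this model the same delete call fails or succeeds purely depending on ordering relative to close, which is the kind of timing dependence that would show up only in some runs.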

java.lang.AssertionError: expected:<0> but was:<58>
	at __randomizedtesting.SeedInfo.seed([CC16184ED99D441C:B456CEF4D51D8E1E]:0)
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.lambda$cleanUp$0(RemoteSplitIndexIT.java:117)
	at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1136)
	at org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.cleanUp(RemoteSplitIndexIT.java:115)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
	at org.opensearch.test.OpenSearchTestClusterRule$1.evaluate(OpenSearchTestClusterRule.java:369)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
	at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
	at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
	at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.base/java.lang.Thread.run(Thread.java:1575)
	Suppressed: java.lang.AssertionError: expected:<0> but was:<58>
		at org.junit.Assert.fail(Assert.java:89)
		at org.junit.Assert.failNotEquals(Assert.java:835)
		at org.junit.Assert.assertEquals(Assert.java:647)
		at org.junit.Assert.assertEquals(Assert.java:633)
		at org.opensearch.action.admin.indices.create.RemoteSplitIndexIT.lambda$cleanUp$0(RemoteSplitIndexIT.java:117)
		at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1124)
		... 38 more
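
For context on the trace above: the assertion in cleanUp runs inside OpenSearchTestCase.assertBusy, which keeps retrying a check until it passes or a timeout expires, so "expected:<0> but was:<58>" means the count never reached zero within the wait. The sketch below shows that retry pattern in a stand-alone form. The helper name, the backoff values, and the counter are hypothetical, and it is not the OpenSearch implementation.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-alone sketch of a retrying assertion; not the OpenSearch helper.
class BusyAssertSketch {

    // Runs the check repeatedly until it stops throwing AssertionError,
    // or rethrows the last failure once the timeout has elapsed.
    static void awaitAssertion(Runnable check, long timeout, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return;
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e;
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100); // capped exponential backoff
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for "files still present"; drops by one each time it is polled.
        AtomicInteger remaining = new AtomicInteger(3);
        awaitAssertion(() -> {
            int left = remaining.getAndUpdate(n -> Math.max(0, n - 1));
            if (left != 0) {
                throw new AssertionError("expected:<0> but was:<" + left + ">");
            }
        }, 5, TimeUnit.SECONDS);
        System.out.println("cleanup observed");
    }
}
```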

gbbafna avatar Mar 05 '25 05:03 gbbafna

Closed by https://github.com/opensearch-project/OpenSearch/pull/18329

gbbafna avatar May 19 '25 06:05 gbbafna