
[BUG] Flaky IT testPredictionWithSearchInput_LogisticRegression

Open mingshl opened this issue 9 months ago • 2 comments

What is the bug? This flaky test has failed in many PRs, throwing a `ConcurrentModificationException`:

testPredictionWithSearchInput_LogisticRegression

How can one reproduce the bug?

```
2> REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:test' --tests "org.opensearch.ml.action.prediction.PredictionITTests.testPredictionWithDataFrame_LinearRegression" -Dtests.seed=F80C5CAF7767D1EF -Dtests.security.manager=false -Dtests.locale=fr-GA -Dtests.timezone=Asia/Dacca -Druntime.java=21
2> java.util.ConcurrentModificationException
        at __randomizedtesting.SeedInfo.seed([F80C5CAF7767D1EF:D2528943D7C22119]:0)
        at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1792)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
        at java.base/java.util.WeakHashMap$ValueSpliterator.forEachRemaining(WeakHashMap.java:1223)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:776)
        at org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:766)
        at org.opensearch.common.logging.Loggers.removeAppender(Loggers.java:176)
        at org.opensearch.test.OpenSearchTestCase.removeHeaderWarningAppender(OpenSearchTestCase.java:411)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at org.opensearch.test.OpenSearchTestClusterRule$1.evaluate(OpenSearchTestClusterRule.java:369)
        at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
1> [2025-03-01T11:34:06,688][INFO ][o.o.p.PluginsService     ] [node_s0] PluginService:onIndexModule index:[.plugins-ml-model/bPMhMgmUSayU9aN5lzMSXw]
1> [2025-03-01T11:34:06,782][INFO ][o.o.m.e.i.MLIndicesHandler] [node_s0] create index:.plugins-ml-model
1> [2025-03-01T11:34:06,784][WARN ][o.o.c.r.a.AllocationService] [node_s1] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
1> [2025-03-01T11:34:06,800][INFO ][o.o.p.PluginsService     ] [node_s1] PluginService:onIndexModule index:[.plugins-ml-model/bPMhMgmUSayU9aN5lzMSXw]
1> [2025-03-01T11:34:06,847][INFO ][o.o.m.t.MLTrainingTaskRunner] [node_s0] Model saved into index, result:CREATED, model id: XPsyUJUBh0J32Yqzk3zB
1> [2025-03-01T11:34:07,051][INFO ][o.o.i.r.RecoverySourceHandler] [node_s0] [.plugins-ml-model][0][recover to node_s1] finalizing recovery took [38.1ms]
1> [2025-03-01T11:34:07,055][INFO ][o.o.c.r.a.AllocationService] [node_s1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-model][0]]]).
1> [2025-03-01T11:34:07,088][WARN ][o.o.c.r.a.AllocationService] [node_s1] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
1> [2025-03-01T11:34:07,293][INFO ][o.o.m.t.MLTrainingTaskRunner] [node_s1] Model saved into index, result:CREATED, model id: XfsyUJUBh0J32YqzlXyO
1> [2025-03-01T11:34:07,474][INFO ][o.o.m.t.MLTrainingTaskRunner] [node_s0] Model saved into index, result:CREATED, model id: XvsyUJUBh0J32YqzlnxM
1> [2025-03-01T11:34:07,853][INFO ][o.o.m.t.MLTrainingTaskRunner] [node_s0] Model saved into index, result:CREATED, model id: X_syUJUBh0J32Yqzl3y5
1> [2025-03-01T11:34:07,943][INFO ][o.o.m.a.p.PredictionITTests] [testPredictionWithDataFrame_LinearRegression] after test
1> [2025-03-01T11:34:07,945][INFO ][o.o.t.OpenSearchTestClusterRule] [testPredictionWithDataFrame_LinearRegression] [PredictionITTests#testPredictionWithDataFrame_LinearRegression]: cleaning up after test
1> [2025-03-01T11:34:07,975][WARN ][o.o.c.r.a.AllocationService] [node_s1] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
1> [2025-03-01T11:34:07,990][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s1] [iris_data_for_prediction_it/eYJbr-yzRheiMzeHOyf_dg] deleting index
1> [2025-03-01T11:34:07,990][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_s1] [.plugins-ml-model/bPMhMgmUSayU9aN5lzMSXw] deleting index
1> [2025-03-01T11:34:07,990][WARN ][o.o.c.r.a.AllocationService] [node_s1] Falling back to single shard assignment since batch mode disable or multiple custom allocators set
```
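For context, the top of the trace shows the exception coming out of log4j's `LoggerContext.updateLoggers` while `Loggers.removeAppender` runs during test teardown: a stream is iterating a `HashMap`/`WeakHashMap` of loggers while something else mutates the map. Below is a minimal, self-contained sketch of that fail-fast mechanism only; the class and method names are invented for illustration and this is not ml-commons or log4j code:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Demonstrates HashMap's fail-fast value spliterator: streaming over
// the values while the map is mutated throws
// ConcurrentModificationException, the same failure mode as
// HashMap$ValueSpliterator.forEachRemaining in the trace above.
public class CmeDemo {

    // Returns true if mutating the map mid-stream raised a CME.
    static boolean triggersCme() {
        Map<String, String> loggers = new HashMap<>();
        loggers.put("root", "appender-a");
        loggers.put("test", "appender-b");
        try {
            // The stream's forEach delegates to the map's fail-fast
            // spliterator; removing an entry bumps modCount, which the
            // spliterator detects and reports.
            loggers.values().stream().forEach(v -> loggers.remove("root"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME triggered: " + triggersCme());
    }
}
```

In the real failure the mutation comes from a concurrent thread rather than the iterating lambda, which is why the test only fails intermittently.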

What is the expected behavior? The test should pass.

What is your host/environment?

  • OS: [e.g. iOS]
  • Version [e.g. 22]
  • Plugins

Do you have any screenshots? If applicable, add screenshots to help explain your problem.

Do you have any additional context? Add any other context about the problem.

mingshl commented Mar 01 '25 18:03