[CI] TermsDocCountErrorIT testStringValueFieldSingleShard failing

Open kingherc opened this issue 1 year ago • 5 comments

Build scan: https://gradle-enterprise.elastic.co/s/nmvhirxqjhgio/tests/:server:internalClusterTest/org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT/testStringValueFieldSingleShard

Reproduction line:

./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT.testStringValueFieldSingleShard" -Dtests.seed=BC1F3C6CF02B639F -Dtests.locale=ar-YE -Dtests.timezone=SystemV/PST8 -Druntime.java=17 -Dtests.fips.enabled=true

Applicable branches: main

Reproduces locally?: Didn't try

Failure history: Failure dashboard for org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT#testStringValueFieldSingleShard

Failure excerpt:

java.lang.AssertionError: 
Expected: <0L>
     but: was <2L>

  at __randomizedtesting.SeedInfo.seed([BC1F3C6CF02B639F:ADBD5BE50A343DAC]:0)
  at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
  at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
  at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2119)
  at org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT.assertNoDocCountError(TermsDocCountErrorIT.java:222)
  at org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT.lambda$testStringValueFieldSingleShard$2(TermsDocCountErrorIT.java:310)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.lambda$assertNoFailuresAndResponse$9(ElasticsearchAssertions.java:354)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertResponse(ElasticsearchAssertions.java:375)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertNoFailuresAndResponse(ElasticsearchAssertions.java:352)
  at org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT.lambda$testStringValueFieldSingleShard$3(TermsDocCountErrorIT.java:301)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.lambda$assertNoFailuresAndResponse$9(ElasticsearchAssertions.java:354)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertResponse(ElasticsearchAssertions.java:375)
  at org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertNoFailuresAndResponse(ElasticsearchAssertions.java:352)
  at org.elasticsearch.search.aggregations.bucket.TermsDocCountErrorIT.testStringValueFieldSingleShard(TermsDocCountErrorIT.java:292)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:568)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:833)

kingherc avatar Feb 14 '24 14:02 kingherc

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine avatar Feb 14 '24 14:02 elasticsearchmachine

I had a look, and this test fails very rarely. I was not able to reproduce it locally, so I am marking it as low risk.

iverase avatar Feb 15 '24 13:02 iverase

Failed again on the 14th and 29th March and on the 2nd April. Still a very rare occurrence (3/13683), but does seem to happen occasionally. I was also unable to reproduce it locally with the same seed that failed on the 2nd, -Dtests.seed=F0D1AA5A946FFC52.

craigtaverner avatar Apr 02 '24 09:04 craigtaverner

There was also a (rare) failure on testDoubleValueField, not sure if it's related:

https://gradle-enterprise.elastic.co/s/hrda2ynmljf3m

kkrik-es avatar Apr 16 '24 06:04 kkrik-es

The failure above is a different issue addressed in https://github.com/elastic/elasticsearch/issues/107535

iverase avatar Apr 16 '24 13:04 iverase

I looked into this failure, and it seems to have been introduced by concurrency. The conditions under which a terms aggregation runs concurrently are very particular, so this test rarely runs concurrently, but when it does, it might fail.

One way to fix it would be to force merge the index "idx_single_shard" to 1 segment, but I wonder if that might remove some interesting code paths from the tests.
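
To illustrate why force merging could hide the problem, here is a minimal sketch of a simplified model of the doc count error. It is an assumption for illustration, not the Elasticsearch implementation: each partition (a shard, or a slice of segments searched concurrently) returns only its top `size` terms, and the error upper bound is the sum of the smallest returned count from every partition that truncated its list. The class and method names (`DocCountErrorSketch`, `docCountErrorUpperBound`, `forceMerge`) are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified model, NOT the Elasticsearch code: all names here are hypothetical.
final class DocCountErrorSketch {

    // Upper bound on how much a term's merged doc count may be off when every
    // partition reports only its top `size` terms (ordered by count, descending).
    static long docCountErrorUpperBound(List<Map<String, Long>> partitions, int size) {
        if (partitions.size() == 1) {
            return 0L; // a single partition sees every document, so its counts are exact
        }
        long error = 0L;
        for (Map<String, Long> partition : partitions) {
            if (partition.size() <= size) {
                continue; // nothing was cut off in this partition
            }
            List<Long> top = partition.values().stream()
                .sorted(Comparator.reverseOrder())
                .limit(size)
                .collect(Collectors.toList());
            // Any term that was cut off has at most the count of the last returned term.
            error += top.get(top.size() - 1);
        }
        return error;
    }

    // Models a force merge: all partitions collapse into one with summed counts.
    static Map<String, Long> forceMerge(List<Map<String, Long>> partitions) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> partition : partitions) {
            partition.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two slices of the same single-shard index, each holding three terms.
        List<Map<String, Long>> slices = List.of(
            Map.of("a", 5L, "b", 3L, "c", 1L),
            Map.of("a", 1L, "b", 4L, "c", 2L)
        );
        // Each slice truncates to 2 terms, so the bound is 3 + 2.
        System.out.println(docCountErrorUpperBound(slices, 2)); // prints 5
        // After the modelled force merge there is one partition and no error.
        System.out.println(docCountErrorUpperBound(List.of(forceMerge(slices)), 2)); // prints 0
    }
}
```

Under this model, a single-shard index reports an error of 0 only as long as it is collected as one partition, which is also why forcing one segment would make the test pass without exercising the concurrent path.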

iverase avatar Jun 27 '24 10:06 iverase

Actually, my understanding is that we only enable concurrency when the result is exact, so this is unexpected.

@craigtaverner I am going to take this one.

iverase avatar Jul 01 '24 13:07 iverase