[CI] IndexRecoveryIT testDoNotInfinitelyWaitForMapping failing
Build scan: https://gradle-enterprise.elastic.co/s/4d2lhrqhockm6/tests/:server:internalClusterTest/org.elasticsearch.indices.recovery.IndexRecoveryIT/testDoNotInfinitelyWaitForMapping
Reproduction line:
./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.indices.recovery.IndexRecoveryIT.testDoNotInfinitelyWaitForMapping" -Dtests.seed=40853F21F419B395 -Dtests.jvm.argline="-Des.concurrent_search=true" -Dtests.locale=id-ID -Dtests.timezone=Asia/Jerusalem -Druntime.java=21
Applicable branches: main
Reproduces locally?: Didn't try
Failure history:
Failure dashboard for org.elasticsearch.indices.recovery.IndexRecoveryIT#testDoNotInfinitelyWaitForMapping
Failure excerpt:
org.elasticsearch.index.engine.RefreshFailedEngineException: Refresh failed
at __randomizedtesting.SeedInfo.seed([40853F21F419B395:C12454B136DD2E6B]:0)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2077)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2016)
at org.elasticsearch.index.engine.EngineTestCase.getDocIds(EngineTestCase.java:1225)
at org.elasticsearch.index.shard.IndexShardTestCase.getDocIdAndSeqNos(IndexShardTestCase.java:925)
at org.elasticsearch.test.InternalTestCluster.lambda$assertSameDocIdsOnShards$16(InternalTestCluster.java:1485)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1243)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1228)
at org.elasticsearch.test.InternalTestCluster.assertSameDocIdsOnShards(InternalTestCluster.java:1462)
at org.elasticsearch.indices.recovery.AbstractIndexRecoveryIntegTestCase.beforeIndexDeletion(AbstractIndexRecoveryIntegTestCase.java:89)
at org.elasticsearch.indices.recovery.IndexRecoveryIT.beforeIndexDeletion(IndexRecoveryIT.java:183)
at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:588)
at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2316)
at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.lang.reflect.Method.invoke(Method.java:580)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
at java.lang.Thread.run(Thread.java:1583)
Caused by: org.apache.lucene.index.CorruptIndexException: Problem reading index from store(ByteSizeCachingDirectory(ElasticsearchMockDirectoryWrapper(HybridDirectory@/opt/buildkite-agent/builds/bk-agent-prod-gcp-1707109485745743789/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_40853F21F419B395-001/tempDir-005/node_t0/indices/ZNwxG7VvShuwYV78RTjknA/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c169f59))) (resource=store(ByteSizeCachingDirectory(ElasticsearchMockDirectoryWrapper(HybridDirectory@/opt/buildkite-agent/builds/bk-agent-prod-gcp-1707109485745743789/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_40853F21F419B395-001/tempDir-005/node_t0/indices/ZNwxG7VvShuwYV78RTjknA/0/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c169f59))))
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:165)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:178)
at org.apache.lucene.index.ReadersAndUpdates.getLatestReader(ReadersAndUpdates.java:243)
at org.apache.lucene.index.SoftDeletesRetentionMergePolicy.keepFullyDeletedSegment(SoftDeletesRetentionMergePolicy.java:82)
at org.apache.lucene.index.FilterMergePolicy.keepFullyDeletedSegment(FilterMergePolicy.java:118)
at org.apache.lucene.index.FilterMergePolicy.keepFullyDeletedSegment(FilterMergePolicy.java:118)
at org.apache.lucene.index.ReadersAndUpdates.keepFullyDeletedSegment(ReadersAndUpdates.java:822)
at org.apache.lucene.index.IndexWriter.isFullyDeleted(IndexWriter.java:6078)
at org.apache.lucene.index.IndexWriter.publishFlushedSegment(IndexWriter.java:2895)
at org.apache.lucene.index.IndexWriter.lambda$publishFlushedSegments$26(IndexWriter.java:5919)
at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:102)
at org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(DocumentsWriterFlushQueue.java:133)
at org.apache.lucene.index.DocumentsWriter.purgeFlushTickets(DocumentsWriter.java:188)
at org.apache.lucene.index.IndexWriter.publishFlushedSegments(IndexWriter.java:5895)
at org.apache.lucene.index.IndexWriter$1.afterSegmentsFlushed(IndexWriter.java:437)
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:531)
at org.apache.lucene.index.DocumentsWriter.maybeFlush(DocumentsWriter.java:446)
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:636)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:576)
at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:381)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:355)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:345)
at org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:170)
at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:48)
at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:27)
at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167)
at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240)
at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:462)
at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:442)
at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167)
at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2045)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2016)
at org.elasticsearch.index.engine.EngineTestCase.getDocIds(EngineTestCase.java:1225)
at org.elasticsearch.index.shard.IndexShardTestCase.getDocIdAndSeqNos(IndexShardTestCase.java:925)
at org.elasticsearch.test.InternalTestCluster.lambda$assertSameDocIdsOnShards$16(InternalTestCluster.java:1485)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1243)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1228)
at org.elasticsearch.test.InternalTestCluster.assertSameDocIdsOnShards(InternalTestCluster.java:1462)
at org.elasticsearch.indices.recovery.AbstractIndexRecoveryIntegTestCase.beforeIndexDeletion(AbstractIndexRecoveryIntegTestCase.java:89)
at org.elasticsearch.indices.recovery.IndexRecoveryIT.beforeIndexDeletion(IndexRecoveryIT.java:183)
at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:588)
at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2316)
at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.lang.reflect.Method.invoke(Method.java:580)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
at java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.FileNotFoundException: No sub-file with id .kdi found in compound file "_0.cfs" (fileName=_0.kdi files: [_0.pos, .nvm, .fnm, _0.tip, _Lucene90_0.dvd, _0.doc, _0.tim, _Lucene90_0.dvm, _ES87BloomFilter_0.bfm, .fdm, .nvd, _ES87BloomFilter_0.bfi, _0.tmd, .fdx, .fdt])
at org.apache.lucene.codecs.lucene90.Lucene90CompoundReader.openInput(Lucene90CompoundReader.java:170)
at org.apache.lucene.codecs.lucene90.Lucene90PointsReader.<init>(Lucene90PointsReader.java:63)
at org.apache.lucene.codecs.lucene90.Lucene90PointsFormat.fieldsReader(Lucene90PointsFormat.java:74)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:152)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:178)
at org.apache.lucene.index.ReadersAndUpdates.getLatestReader(ReadersAndUpdates.java:243)
at org.apache.lucene.index.SoftDeletesRetentionMergePolicy.keepFullyDeletedSegment(SoftDeletesRetentionMergePolicy.java:82)
at org.apache.lucene.index.FilterMergePolicy.keepFullyDeletedSegment(FilterMergePolicy.java:118)
at org.apache.lucene.index.FilterMergePolicy.keepFullyDeletedSegment(FilterMergePolicy.java:118)
at org.apache.lucene.index.ReadersAndUpdates.keepFullyDeletedSegment(ReadersAndUpdates.java:822)
at org.apache.lucene.index.IndexWriter.isFullyDeleted(IndexWriter.java:6078)
at org.apache.lucene.index.IndexWriter.publishFlushedSegment(IndexWriter.java:2895)
at org.apache.lucene.index.IndexWriter.lambda$publishFlushedSegments$26(IndexWriter.java:5919)
at org.apache.lucene.index.DocumentsWriterFlushQueue.innerPurge(DocumentsWriterFlushQueue.java:102)
at org.apache.lucene.index.DocumentsWriterFlushQueue.tryPurge(DocumentsWriterFlushQueue.java:133)
at org.apache.lucene.index.DocumentsWriter.purgeFlushTickets(DocumentsWriter.java:188)
at org.apache.lucene.index.IndexWriter.publishFlushedSegments(IndexWriter.java:5895)
at org.apache.lucene.index.IndexWriter$1.afterSegmentsFlushed(IndexWriter.java:437)
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:531)
at org.apache.lucene.index.DocumentsWriter.maybeFlush(DocumentsWriter.java:446)
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:636)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:576)
at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:381)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:355)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:345)
at org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:170)
at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:48)
at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:27)
at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167)
at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240)
at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:462)
at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:442)
at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167)
at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2045)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2016)
at org.elasticsearch.index.engine.EngineTestCase.getDocIds(EngineTestCase.java:1225)
at org.elasticsearch.index.shard.IndexShardTestCase.getDocIdAndSeqNos(IndexShardTestCase.java:925)
at org.elasticsearch.test.InternalTestCluster.lambda$assertSameDocIdsOnShards$16(InternalTestCluster.java:1485)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1243)
at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1228)
at org.elasticsearch.test.InternalTestCluster.assertSameDocIdsOnShards(InternalTestCluster.java:1462)
at org.elasticsearch.indices.recovery.AbstractIndexRecoveryIntegTestCase.beforeIndexDeletion(AbstractIndexRecoveryIntegTestCase.java:89)
at org.elasticsearch.indices.recovery.IndexRecoveryIT.beforeIndexDeletion(IndexRecoveryIT.java:183)
at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:588)
at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2316)
at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.lang.reflect.Method.invoke(Method.java:580)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
at java.lang.Thread.run(Thread.java:1583)
Pinging @elastic/es-distributed (Team:Distributed)
Caused by: java.io.FileNotFoundException: No sub-file with id .kdi found in compound file "_0.cfs" (fileName=_0.kdi files: [_0.pos, .nvm, .fnm, _0.tip, _Lucene90_0.dvd, _0.doc, _0.tim, _Lucene90_0.dvm, _ES87BloomFilter_0.bfm, .fdm, .nvd, _ES87BloomFilter_0.bfi, _0.tmd, .fdx, .fdt])
is a really weird error. I saw it was mentioned in https://github.com/elastic/elasticsearch/issues/99075, but that issue was closed without follow-up. I'm going to reassign this issue to the Search team to take a look at it.
Pinging @elastic/es-search (Team:Search)
Marking as a blocker: this is very weird, shouldn't happen, and I don't immediately see it being caused by test parameters or the environment.
I dug through 13 of the past failures, and every single one of them failed with the exact same error. If this were a general corruption issue, I would expect various files to be missing from the .cfs.
Here it's always:
java.io.FileNotFoundException: No sub-file with id .kdi found in compound file "_0.cfs" (fileName=_0.kdi files: [_0.pos, .nvm, .fnm, _0.tip, _Lucene90_0.dvd, _0.doc, _0.tim, _Lucene90_0.dvm, _ES87BloomFilter_0.bfm, .fdm, .nvd, _ES87BloomFilter_0.bfi, _0.tmd, .fdx, .fdt])
The .kdi file stores the inner nodes of the points index tree. If point values were being stored, I would also expect to see .kdd and .kdm files, but they are missing as well.
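As a sanity check, here is a minimal standalone sketch (my own, assuming a plain Lucene 9.x classpath rather than the ES test harness; the class and field names are illustrative) that indexes a single point value with compound files disabled, so the per-format files stay visible:

import java.util.Arrays;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;

public class PointFilesDemo {
    public static void main(String[] args) throws Exception {
        try (var dir = new ByteBuffersDirectory()) {
            IndexWriterConfig iwc = new IndexWriterConfig();
            iwc.setUseCompoundFile(false); // keep per-format files instead of packing a .cfs
            try (var writer = new IndexWriter(dir, iwc)) {
                Document doc = new Document();
                doc.add(new LongPoint("ts", 42L)); // a single indexed point value
                writer.addDocument(doc);
                writer.commit();
            }
            // With points present, _0.kdi, _0.kdd and _0.kdm should all be listed;
            // drop the LongPoint and all three disappear together.
            Arrays.stream(dir.listAll()).sorted().forEach(System.out::println);
        }
    }
}

All three point files are written (or skipped) together, which makes a lone dangling .kdi reference all the more suspicious.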
The JVM used in ALL the failing tests was:
Oracle OpenJDK Runtime Environment 17.0.2+8-86
The failures occurred on GCP, AWS, and AMD64 Linux runners.
The very first failure occurred on Feb 2, 2024 11:36:10.933, on commit: c1b34c38ab5322b9c4faeb1db21e88a8aa4f7668
Digging into what changed between Feb 1 and Feb 2 with that commit:
OK, it looks like the first failure was in a PR that had commit c1b34c38ab5322b9c4faeb1db21e88a8aa4f7668; however, that PR didn't get merged into main until Feb 7, so it isn't the cause. The PR was: https://github.com/elastic/elasticsearch/pull/104980
The first commit that failed on the main build was 7d9253ed35bde1493bb905015210dec412324066, on Feb 2.
However, looking at the first failure from the PR, the commit it was based on in GitHub was: https://github.com/elastic/elasticsearch/commits/main/?before=da7bed3584bac69113778654e6ee0f82bdb4d993+35
The plot thickens
We upgraded to Lucene 9.9.2 on Jan 30th; it's weird that it took three days for a failure to occur if that was the cause, because once the test started failing, it failed pretty much every day until it was muted.
The other Lucene upgrades were many weeks prior.
Looking at the failing test, IndexRecoveryIT.testDoNotInfinitelyWaitForMapping, it doesn't even have point values:
indicesAdmin().preparePutMapping("test").setSource("test_field", "type=text,analyzer=test_analyzer").get();
I don't think we write out the point value files if there aren't any point values to store. So why does the .cfs think there should be point files?
OK, the field info apparently THINKS there should be point values.
This makes me wonder if there is something funky with our metadata point values like seq_no.
I do see this PR near the start of the failures: https://github.com/elastic/elasticsearch/pull/105036
Only a few things in the PR made me do a double take, but everything seemed OK.
Looking at the indexing pipeline, the only way I can see us getting into this weird situation is if the field-info says it has indexed points, but there actually aren't any. I could see this happening if we flush before we have finished writing points, but this should not be possible.
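If that theory holds, a copy of the shard should show a field whose FieldInfo claims indexed points while no point values are readable. A hypothetical diagnostic sketch (my own, assuming a Lucene 9.x classpath and that the index is openable at all; on the corrupt copy, DirectoryReader.open would just reproduce the exception above):

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.PointValues;
import org.apache.lucene.store.FSDirectory;

public class PointMetadataCheck {
    public static void main(String[] args) throws Exception {
        try (var dir = FSDirectory.open(Paths.get(args[0]));
             var reader = DirectoryReader.open(dir)) {
            for (LeafReaderContext ctx : reader.leaves()) {
                for (FieldInfo fi : ctx.reader().getFieldInfos()) {
                    if (fi.getPointDimensionCount() > 0) {
                        // Field info claims indexed points; check whether any are readable.
                        PointValues points = ctx.reader().getPointValues(fi.name);
                        System.out.println(fi.name + ": dims=" + fi.getPointDimensionCount()
                                + ", points=" + (points == null ? "MISSING" : points.size()));
                    }
                }
            }
        }
    }
}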
I am going to turn on trace logging for this test and unmute it.
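(For reference, the standalone-Lucene knob behind that kind of engine trace is InfoStream; a minimal sketch, assuming a plain Lucene 9.x classpath, of where the IW/DW/IFD lines quoted further down come from. In the ES test itself this is driven through the test logging configuration instead.)

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.util.PrintStreamInfoStream;

public class VerboseIndexing {
    public static void main(String[] args) throws Exception {
        IndexWriterConfig iwc = new IndexWriterConfig();
        // Emit IW/DW/DWPT/IFD/MP trace lines like the ones in the log excerpts below.
        iwc.setInfoStream(new PrintStreamInfoStream(System.out));
        try (var dir = new ByteBuffersDirectory();
             var writer = new IndexWriter(dir, iwc)) {
            writer.commit();
        }
    }
}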
I tried downloading and reading the segment information:
08:27:30 [INFO] org.apache.lucene.luke.models.util.IndexUtils: IndexReaders (1 leaf readers) successfully opened. Index path=<path_to_server>/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_40853F21F419B395-001/tempDir-005/node_t0/indices/ZNwxG7VvShuwYV78RTjknA/0/index
Note the path is the same:
server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_40853F21F419B395-001/tempDir-005/node_t0/indices/ZNwxG7VvShuwYV78RTjknA/0/index
But the only thing in the index was the single segments_3 file. That file had nothing in it from what I could tell.
But, there is this segment info:
{es_version=8502001, history_uuid=NQT2gytYSliQ0JYGTOrAdQ, local_checkpoint=-1, max_seq_no=-1, max_unsafe_auto_id_timestamp=-1, translog_uuid=1ixh6l8ySaeylLPdr4nEmA}
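That map is the user data Elasticsearch attaches to each Lucene commit point. A small sketch (my own, assuming a Lucene 9.x classpath) for dumping it straight from an index directory, which should show the same empty commit as above:

import java.nio.file.Paths;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.store.FSDirectory;

public class CommitUserData {
    public static void main(String[] args) throws Exception {
        try (var dir = FSDirectory.open(Paths.get(args[0]))) {
            SegmentInfos infos = SegmentInfos.readLatestCommit(dir);
            System.out.println("segments file: " + infos.getSegmentsFileName());
            System.out.println("segment count: " + infos.size()); // 0 here: an empty commit
            infos.getUserData().forEach((k, v) -> System.out.println(k + "=" + v));
        }
    }
}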
I am going to unmute this test, but turn on trace logging for our indexing engine.
I was able to get it to fail locally with verbose Lucene & Elasticsearch engine logging turned on: output.log.zip
Time to read 10+MB of logs to see what happened.
Segment 2 refresh and then recovery attempt
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.PeerRecoveryTargetService] [node_t2] [test][0] [5] shard folder empty, recovering all files
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.PeerRecoveryTargetService] [node_t2] [test][0] local file count [0]
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.PeerRecoveryTargetService] [node_t2] [test][0] [internal:index/shard/recovery/start_recovery]: recovery from {node_t0}{KDwOZt6gTHiuY5f5D9xztw}{uwxuBZISTnuVU2ekUQxE0A}{node_t0}{127.0.0.1}{127.0.0.1:14622}{cdfhilmrstw}{8.15.0}{7000099-8505001}
1> DWPT 1 [2024-05-06T14:56:44.407492Z; elasticsearch[node_t0][refresh][T#1]]: flushed: segment=_0 ramUsed=0.077 MB newFlushedSize=0.003 MB docs/MB=2,790.621
1> IW 1 [2024-05-06T14:56:44.407523Z; elasticsearch[node_t0][refresh][T#1]]: create compound file
1> FS 0 [2024-05-06T14:56:44.407680Z; elasticsearch[node_t0][refresh][T#1]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.cfs
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.PeerRecoverySourceService] [node_t0] [test][0] starting recovery to {node_t2}{gNu8pGz3T2uW2ze3HLCZ2A}{0jgJoFXYTEOJMpEa7CFbMg}{node_t2}{127.0.0.1}{127.0.0.1:14621}{cdfhilmrstw}{8.15.0}{7000099-8505001}
1> FS 0 [2024-05-06T14:56:44.407803Z; elasticsearch[node_t0][refresh][T#1]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.cfe
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] history is retained by retention lock
1> [2024-05-06T17:56:44,407][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] performing file-based recovery followed by history replay starting at [0]
1> FS 0 [2024-05-06T14:56:44.412071Z; elasticsearch[node_t0][refresh][T#1]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.si
1> DWPT 1 [2024-05-06T14:56:44.412172Z; elasticsearch[node_t0][refresh][T#1]]: flush time 14.615375 ms
1> IW 1 [2024-05-06T14:56:44.412232Z; elasticsearch[node_t0][refresh][T#1]]: publishFlushedSegment seg-private updates=null
1> IW 1 [2024-05-06T14:56:44.412259Z; elasticsearch[node_t0][refresh][T#1]]: publishFlushedSegment _0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17q
1> BD 1 [2024-05-06T14:56:44.412268Z; elasticsearch[node_t0][refresh][T#1]]: finished packet delGen=1 now completedDelGen=1
1> IW 1 [2024-05-06T14:56:44.412277Z; elasticsearch[node_t0][refresh][T#1]]: publish sets newSegment delGen=1 seg=_0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17q
1> IFD 1 [2024-05-06T14:56:44.412303Z; elasticsearch[node_t0][refresh][T#1]]: now checkpoint "_0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17r" [1 segments ; isCommit = false]
1> IFD 1 [2024-05-06T14:56:44.412312Z; elasticsearch[node_t0][refresh][T#1]]: now delete 0 files: []
1> IFD 1 [2024-05-06T14:56:44.412318Z; elasticsearch[node_t0][refresh][T#1]]: 0 ms to checkpoint
1> IFD 1 [2024-05-06T14:56:44.412330Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.kdm"
1> IFD 1 [2024-05-06T14:56:44.412335Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES87BloomFilter_0.bfm"
1> IFD 1 [2024-05-06T14:56:44.412336Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.kdi"
1> IFD 1 [2024-05-06T14:56:44.412337Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.fdm"
1> IFD 1 [2024-05-06T14:56:44.412338Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES87BloomFilter_0.bfi"
1> IFD 1 [2024-05-06T14:56:44.412338Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.nvd"
1> IFD 1 [2024-05-06T14:56:44.412339Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES812Postings_0.tmd"
1> IFD 1 [2024-05-06T14:56:44.412340Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.fdx"
1> IFD 1 [2024-05-06T14:56:44.412340Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_0.tmd"
1> IFD 1 [2024-05-06T14:56:44.412342Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.fdt"
1> IFD 1 [2024-05-06T14:56:44.412342Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_Lucene90_0.dvm"
1> IFD 1 [2024-05-06T14:56:44.412343Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES812Postings_0.pos"
1> IFD 1 [2024-05-06T14:56:44.412344Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_Lucene90_0.dvd"
1> IFD 1 [2024-05-06T14:56:44.412345Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.kdd"
1> IFD 1 [2024-05-06T14:56:44.412346Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES812Postings_0.tip"
1> IFD 1 [2024-05-06T14:56:44.412346Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_0.pos"
1> IFD 1 [2024-05-06T14:56:44.412347Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES812Postings_0.doc"
1> IFD 1 [2024-05-06T14:56:44.412348Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_ES812Postings_0.tim"
1> IFD 1 [2024-05-06T14:56:44.412349Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_0.doc"
1> IFD 1 [2024-05-06T14:56:44.412349Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_0.tip"
1> IFD 1 [2024-05-06T14:56:44.412350Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0_0.tim"
1> IFD 1 [2024-05-06T14:56:44.412351Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.nvm"
1> IFD 1 [2024-05-06T14:56:44.412352Z; elasticsearch[node_t0][refresh][T#1]]: will delete new file "_0.fnm"
1> IFD 1 [2024-05-06T14:56:44.412354Z; elasticsearch[node_t0][refresh][T#1]]: now delete 23 files: [_0.kdm, _0_ES87BloomFilter_0.bfm, _0.kdi, _0.fdm, _0_ES87BloomFilter_0.bfi, _0.nvd, _0_ES812Postings_0.tmd, _0.fdx, _0_0.tmd, _0.fdt, _0_Lucene90_0.dvm, _0_ES812Postings_0.pos, _0_Lucene90_0.dvd, _0.kdd, _0_ES812Postings_0.tip, _0_0.pos, _0_ES812Postings_0.doc, _0_ES812Postings_0.tim, _0_0.doc, _0_0.tip, _0_0.tim, _0.nvm, _0.fnm]
1> FS 0 [2024-05-06T14:56:44.412555Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.kdm
1> FS 0 [2024-05-06T14:56:44.412676Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES87BloomFilter_0.bfm
1> FS 0 [2024-05-06T14:56:44.412789Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.kdi
1> FS 0 [2024-05-06T14:56:44.412919Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.fdm
1> FS 0 [2024-05-06T14:56:44.413030Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES87BloomFilter_0.bfi
1> FS 0 [2024-05-06T14:56:44.413140Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.nvd
1> FS 0 [2024-05-06T14:56:44.413252Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES812Postings_0.tmd
1> FS 0 [2024-05-06T14:56:44.413397Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.fdx
1> FS 0 [2024-05-06T14:56:44.413554Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_0.tmd
1> FS 0 [2024-05-06T14:56:44.413772Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.fdt
1> FS 0 [2024-05-06T14:56:44.413895Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_Lucene90_0.dvm
1> FS 0 [2024-05-06T14:56:44.414023Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES812Postings_0.pos
1> FS 0 [2024-05-06T14:56:44.414149Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_Lucene90_0.dvd
1> FS 0 [2024-05-06T14:56:44.414291Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.kdd
1> FS 0 [2024-05-06T14:56:44.414484Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES812Postings_0.tip
1> FS 0 [2024-05-06T14:56:44.414647Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_0.pos
1> FS 0 [2024-05-06T14:56:44.414853Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES812Postings_0.doc
1> FS 0 [2024-05-06T14:56:44.414982Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_ES812Postings_0.tim
1> FS 0 [2024-05-06T14:56:44.415086Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_0.doc
1> FS 0 [2024-05-06T14:56:44.415243Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_0.tip
1> FS 0 [2024-05-06T14:56:44.415399Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0_0.tim
1> FS 0 [2024-05-06T14:56:44.415547Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.nvm
1> FS 0 [2024-05-06T14:56:44.415702Z; elasticsearch[node_t0][refresh][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t0/indices/BnczjlEdTay5HKfFDDXjnA/0/index/_0.fnm
1> IW 1 [2024-05-06T14:56:44.415716Z; elasticsearch[node_t0][refresh][T#1]]: now apply all deletes for all segments buffered updates bytesUsed=0 reader pool bytesUsed=0
1> BD 1 [2024-05-06T14:56:44.415720Z; elasticsearch[node_t0][refresh][T#1]]: waitApply: no deletes to apply
1> IW 1 [2024-05-06T14:56:44.416565Z; elasticsearch[node_t0][refresh][T#1]]: incRefDeleter for NRT reader version=7 segments=_0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17r
1> IW 1 [2024-05-06T14:56:44.416595Z; elasticsearch[node_t0][refresh][T#1]]: return reader version=7 reader=StandardDirectoryReader(segments_2:7:nrt _0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17r)
1> DW 1 [2024-05-06T14:56:44.416602Z; elasticsearch[node_t0][refresh][T#1]]: elasticsearch[node_t0][refresh][T#1] finishFullFlush success=true
1> MP 1 [2024-05-06T14:56:44.416711Z; elasticsearch[node_t0][refresh][T#1]]: seg=_0(9.10.0):c8:[diagnostics={os=Mac OS X, java.vendor=Oracle Corporation, java.runtime.version=21+35-2513, timestamp=1715007404407, source=flush, lucene.version=9.10.0, os.version=14.4.1, os.arch=aarch64}]:[attributes={Zstd814StoredFieldsFormat.mode=BEST_SPEED}] :id=4iou9q6kmesl3lqlhwa0wb17r size=0.004 MB [floored]
1> MP 1 [2024-05-06T14:56:44.416718Z; elasticsearch[node_t0][refresh][T#1]]: findMerges: 1 segments
1> MP 1 [2024-05-06T14:56:44.416722Z; elasticsearch[node_t0][refresh][T#1]]: allowedSegmentCount=10 vs count=1 (eligible count=1)
1> IW 1 [2024-05-06T14:56:44.416732Z; elasticsearch[node_t0][refresh][T#1]]: getReader took 19 ms
1> IFD 1 [2024-05-06T14:56:44.416869Z; elasticsearch[node_t0][refresh][T#1]]: now delete 0 files: []
1> IW 1 [2024-05-06T14:56:44.416889Z; elasticsearch[node_t0][refresh][T#1]]: decRefDeleter for NRT reader version=5 segments=
1> [2024-05-06T17:56:44,417][DEBUG][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] no peer-recovery retention lease for [LycXT1dmS1q1c3GTALV7ZA]
1> [2024-05-06T17:56:44,417][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] recovery [phase1]: recovering [segments_2] from peer, does not exist in remote
1> [2024-05-06T17:56:44,417][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] recovery [phase1]: total_size[227b], recovering_files [1] with total_size [227b] from peer, recovering_files [0] with total_size [0b] from snapshot, reusing_files [0] with total_size [0b]
1> FS 0 [2024-05-06T14:56:44.419177Z; elasticsearch[node_t2][generic][T#4]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/recovery.sxNv9dIqSwawn6ZxnXUqMA.segments_2
1> FS 0 [2024-05-06T14:56:44.419465Z; elasticsearch[node_t2][generic][T#4]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/recovery.sxNv9dIqSwawn6ZxnXUqMA.segments_2
1> [2024-05-06T17:56:44,419][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] cloning primary's retention lease for target node ID [gNu8pGz3T2uW2ze3HLCZ2A]
1> [2024-05-06T17:56:44,419][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] cloned primary's retention lease as [RetentionLease{id='peer_recovery/gNu8pGz3T2uW2ze3HLCZ2A', retainingSequenceNumber=0, timestamp=1715007404229, source='peer recovery'}]
1> [2024-05-06T17:56:44,420][INFO ][o.e.i.r.IndexRecoveryIT ] [testDoNotInfinitelyWaitForMapping] [IndexRecoveryIT#testDoNotInfinitelyWaitForMapping {seed=[90F1B792EA6C1C0E:56C0D86A80589985]}]: cleaning up after test
1> [2024-05-06T17:56:44,420][TRACE][o.e.i.r.IndexRecoveryIT ] [testDoNotInfinitelyWaitForMapping] Check consistency for [3] nodes
Then on node_t2
1> FS 0 [2024-05-06T14:56:44.421766Z; elasticsearch[node_t2][generic][T#3]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index)
1> FS 0 [2024-05-06T14:56:44.421834Z; elasticsearch[node_t2][generic][T#3]]: newByteChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock)
1> FS 0 [2024-05-06T14:56:44.421955Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[CREATE, WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock
1> FS 0 [2024-05-06T14:56:44.422013Z; elasticsearch[node_t2][generic][T#3]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/segments_2 (FAILED: java.nio.file.NoSuchFileException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/segments_2)
1> FS 0 [2024-05-06T14:56:44.422164Z; elasticsearch[node_t2][generic][T#3]]: move[ATOMIC_MOVE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/recovery.sxNv9dIqSwawn6ZxnXUqMA.segments_2 -> temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/segments_2
1> FS 0 [2024-05-06T14:56:44.422284Z; elasticsearch[node_t2][generic][T#3]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index)
1> FS 0 [2024-05-06T14:56:44.422326Z; elasticsearch[node_t2][generic][T#3]]: newByteChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock)
1> FS 0 [2024-05-06T14:56:44.422433Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[CREATE, WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock
1> FS 0 [2024-05-06T14:56:44.423343Z; elasticsearch[node_t2][generic][T#3]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog
1> FS 0 [2024-05-06T14:56:44.423542Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog.ckp
1> FS 0 [2024-05-06T14:56:44.423808Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[WRITE, READ, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-1.tlog
1> FS 0 [2024-05-06T14:56:44.423895Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog.ckp
1> FS 0 [2024-05-06T14:56:44.424124Z; elasticsearch[node_t2][generic][T#3]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index)
1> FS 0 [2024-05-06T14:56:44.424183Z; elasticsearch[node_t2][generic][T#3]]: newByteChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock)
1> FS 0 [2024-05-06T14:56:44.424335Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[CREATE, WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock
1> IFD 1 [2024-05-06T14:56:44.424631Z; elasticsearch[node_t2][generic][T#3]]: init: current segments file is "segments_2"; deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@f05914a
1> IFD 1 [2024-05-06T14:56:44.424644Z; elasticsearch[node_t2][generic][T#3]]: init: load commit "segments_2"
1> IFD 1 [2024-05-06T14:56:44.424805Z; elasticsearch[node_t2][generic][T#3]]: now delete 0 files: []
1> IFD 1 [2024-05-06T14:56:44.424818Z; elasticsearch[node_t2][generic][T#3]]: now checkpoint "" [0 segments ; isCommit = false]
1> IFD 1 [2024-05-06T14:56:44.424824Z; elasticsearch[node_t2][generic][T#3]]: now delete 0 files: []
1> IFD 1 [2024-05-06T14:56:44.424827Z; elasticsearch[node_t2][generic][T#3]]: 0 ms to checkpoint
1> IW 1 [2024-05-06T14:56:44.424835Z; elasticsearch[node_t2][generic][T#3]]: init: create=false reader=null
1> IW 1 [2024-05-06T14:56:44.424878Z; elasticsearch[node_t2][generic][T#3]]:
New segment_3
1> IW 1 [2024-05-06T14:56:44.424905Z; elasticsearch[node_t2][generic][T#3]]: MMapDirectory.UNMAP_SUPPORTED=true
1> IW 1 [2024-05-06T14:56:44.424917Z; elasticsearch[node_t2][generic][T#3]]: commit: start
1> IW 1 [2024-05-06T14:56:44.424923Z; elasticsearch[node_t2][generic][T#3]]: commit: enter lock
1> IW 1 [2024-05-06T14:56:44.424924Z; elasticsearch[node_t2][generic][T#3]]: commit: now prepare
1> IW 1 [2024-05-06T14:56:44.424926Z; elasticsearch[node_t2][generic][T#3]]: prepareCommit: flush
1> IW 1 [2024-05-06T14:56:44.424929Z; elasticsearch[node_t2][generic][T#3]]: index before flush
1> DW 1 [2024-05-06T14:56:44.424948Z; elasticsearch[node_t2][generic][T#3]]: startFullFlush
1> IW 1 [2024-05-06T14:56:44.424975Z; elasticsearch[node_t2][generic][T#3]]: now apply all deletes for all segments buffered updates bytesUsed=0 reader pool bytesUsed=0
1> BD 1 [2024-05-06T14:56:44.424982Z; elasticsearch[node_t2][generic][T#3]]: waitApply: no deletes to apply
1> DW 1 [2024-05-06T14:56:44.425010Z; elasticsearch[node_t2][generic][T#3]]: elasticsearch[node_t2][generic][T#3] finishFullFlush success=true
1> IW 1 [2024-05-06T14:56:44.425018Z; elasticsearch[node_t2][generic][T#3]]: startCommit(): start
1> IW 1 [2024-05-06T14:56:44.425024Z; elasticsearch[node_t2][generic][T#3]]: startCommit index= changeCount=2
1> FS 0 [2024-05-06T14:56:44.425414Z; elasticsearch[node_t2][generic][T#3]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/pending_segments_3
1> FS 0 [2024-05-06T14:56:44.425678Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/pending_segments_3
1> IW 1 [2024-05-06T14:56:44.425700Z; elasticsearch[node_t2][generic][T#3]]: startCommit: wrote pending segments file "pending_segments_3"
1> IW 1 [2024-05-06T14:56:44.425721Z; elasticsearch[node_t2][generic][T#3]]: done all syncs: []
1> IW 1 [2024-05-06T14:56:44.425728Z; elasticsearch[node_t2][generic][T#3]]: commit: pendingCommit != null
1> FS 0 [2024-05-06T14:56:44.425885Z; elasticsearch[node_t2][generic][T#3]]: move[ATOMIC_MOVE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/pending_segments_3 -> temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/segments_3
1> IW 1 [2024-05-06T14:56:44.426022Z; elasticsearch[node_t2][generic][T#3]]: commit: done writing segments file "segments_3"
1> IFD 1 [2024-05-06T14:56:44.426031Z; elasticsearch[node_t2][generic][T#3]]: now checkpoint "" [0 segments ; isCommit = true]
1> IFD 1 [2024-05-06T14:56:44.426038Z; elasticsearch[node_t2][generic][T#3]]: deleteCommits: now decRef commit "segments_2"
1> IFD 1 [2024-05-06T14:56:44.426048Z; elasticsearch[node_t2][generic][T#3]]: now delete 1 files: [segments_2]
1> FS 0 [2024-05-06T14:56:44.426175Z; elasticsearch[node_t2][generic][T#3]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/segments_2
1> IFD 1 [2024-05-06T14:56:44.426183Z; elasticsearch[node_t2][generic][T#3]]: 0 ms to checkpoint
1> IFD 1 [2024-05-06T14:56:44.426186Z; elasticsearch[node_t2][generic][T#3]]: now delete 0 files: []
1> IW 1 [2024-05-06T14:56:44.426209Z; elasticsearch[node_t2][generic][T#3]]: commit: took 1.3 msec
1> IW 1 [2024-05-06T14:56:44.426213Z; elasticsearch[node_t2][generic][T#3]]: commit: done
1> IW 1 [2024-05-06T14:56:44.426217Z; elasticsearch[node_t2][generic][T#3]]: rollback
1> IW 1 [2024-05-06T14:56:44.426234Z; elasticsearch[node_t2][generic][T#3]]: all running merges have aborted
1> IW 1 [2024-05-06T14:56:44.426238Z; elasticsearch[node_t2][generic][T#3]]: rollback: done finish merges
1> DW 1 [2024-05-06T14:56:44.426246Z; elasticsearch[node_t2][generic][T#3]]: abort
1> DW 1 [2024-05-06T14:56:44.426261Z; elasticsearch[node_t2][generic][T#3]]: done abort success=true
1> IW 1 [2024-05-06T14:56:44.426277Z; elasticsearch[node_t2][generic][T#3]]: rollback: infos=
1> IFD 1 [2024-05-06T14:56:44.426283Z; elasticsearch[node_t2][generic][T#3]]: now checkpoint "" [0 segments ; isCommit = false]
1> IFD 1 [2024-05-06T14:56:44.426285Z; elasticsearch[node_t2][generic][T#3]]: now delete 0 files: []
1> IFD 1 [2024-05-06T14:56:44.426287Z; elasticsearch[node_t2][generic][T#3]]: 0 ms to checkpoint
1> IFD 1 [2024-05-06T14:56:44.426415Z; elasticsearch[node_t2][generic][T#3]]: now delete 0 files: []
1> FS 0 [2024-05-06T14:56:44.426786Z; elasticsearch[node_t2][generic][T#3]]: newOutputStream[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/_state/retention-leases-0.st.tmp
1> FS 0 [2024-05-06T14:56:44.426947Z; elasticsearch[node_t2][generic][T#3]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/_state/retention-leases-0.st.tmp
1> FS 0 [2024-05-06T14:56:44.427152Z; elasticsearch[node_t2][generic][T#3]]: move[ATOMIC_MOVE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/_state/retention-leases-0.st.tmp -> temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/_state/retention-leases-0.st
1> [2024-05-06T17:56:44,428][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] recovery [phase1]: took [0s]
1> [2024-05-06T17:56:44,428][TRACE][o.e.i.r.RecoverySourceHandler] [node_t0] [test][0][recover to node_t2] recovery [phase1]: prepare remote engine for translog
1> FS 0 [2024-05-06T14:56:44.431180Z; elasticsearch[node_t2][generic][T#1]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog)
1> FS 0 [2024-05-06T14:56:44.431606Z; elasticsearch[node_t2][generic][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-0.tlog (FAILED: java.nio.file.NoSuchFileException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-0.tlog)
1> FS 0 [2024-05-06T14:56:44.431643Z; elasticsearch[node_t2][generic][T#1]]: delete: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-0.ckp (FAILED: java.nio.file.NoSuchFileException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-0.ckp)
1> FS 0 [2024-05-06T14:56:44.431864Z; elasticsearch[node_t2][generic][T#1]]: newByteChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-4286841593622259652.ckp
1> FS 0 [2024-05-06T14:56:44.432403Z; elasticsearch[node_t2][generic][T#1]]: copy[REPLACE_EXISTING]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog.ckp -> temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-4286841593622259652.ckp
1> FS 0 [2024-05-06T14:56:44.432476Z; elasticsearch[node_t2][generic][T#1]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-4286841593622259652.ckp
1> FS 0 [2024-05-06T14:56:44.432676Z; elasticsearch[node_t2][generic][T#1]]: move[ATOMIC_MOVE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-4286841593622259652.ckp -> temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-1.ckp
1> FS 0 [2024-05-06T14:56:44.432925Z; elasticsearch[node_t2][generic][T#1]]: newFileChannel[WRITE, READ, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog-2.tlog
1> FS 0 [2024-05-06T14:56:44.433016Z; elasticsearch[node_t2][generic][T#1]]: newFileChannel[WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/translog/translog.ckp
1> [2024-05-06T17:56:44,434][TRACE][o.e.i.e.Engine ] [node_t2] [test][0] recovered maximum sequence number [-1] and local checkpoint [-1]
1> FS 0 [2024-05-06T14:56:44.434229Z; elasticsearch[node_t2][generic][T#1]]: createDirectory: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index)
1> FS 0 [2024-05-06T14:56:44.434293Z; elasticsearch[node_t2][generic][T#1]]: newByteChannel[WRITE, CREATE_NEW]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock (FAILED: java.nio.file.FileAlreadyExistsException: /Users/benjamintrent/Projects/elasticsearch/server/build/testrun/internalClusterTest/temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock)
1> FS 0 [2024-05-06T14:56:44.434422Z; elasticsearch[node_t2][generic][T#1]]: newFileChannel[CREATE, WRITE]: temp/org.elasticsearch.indices.recovery.IndexRecoveryIT_90F1B792EA6C1C0E-001/tempDir-006/node_t2/indices/BnczjlEdTay5HKfFDDXjnA/0/index/write.lock
1> IFD 1 [2024-05-06T14:56:44.434721Z; elasticsearch[node_t2][generic][T#1]]: init: current segments file is "segments_3"; deletionPolicy=org.elasticsearch.index.engine.CombinedDeletionPolicy@6b2f8909
1> IFD 1 [2024-05-06T14:56:44.434734Z; elasticsearch[node_t2][generic][T#1]]: init: load commit "segments_3"
It doesn’t write ANY point files:
1> IW 1 [2024-05-06T14:56:44.444034Z; main]: 0 ms to write fieldInfos
1> DWPT 1 [2024-05-06T14:56:44.444063Z; main]: new segment has 1 deleted docs
1> DWPT 1 [2024-05-06T14:56:44.444067Z; main]: new segment has 0 soft-deleted docs
1> DWPT 1 [2024-05-06T14:56:44.444077Z; main]: new segment has no vectors; norms; docValues; prox; freqs
1> DWPT 1 [2024-05-06T14:56:44.444083Z; main]: flushedFiles=[_0_ES87BloomFilter_0.bfm, _0.fdm, _0_ES87BloomFilter_0.bfi, _0.nvd, _0.fdx, _0_0.tmd, _0.fdt, _0_Lucene90_0.dvm, _0_Lucene90_0.dvd, _0_0.pos, _0_0.doc, _0_0.tip, _0_0.tim, _0.nvm, _0.fnm]
1> DWPT 1 [2024-05-06T14:56:44.444085Z; main]: flushed codec=Elasticsearch814
1> DWPT 1 [2024-05-06T14:56:44.444385Z; main]: flushed: segment=_0 ramUsed=0 MB newFlushedSize=0.002 MB docs/MB=526.923
OK, I can now reproduce this fairly often locally:
for run in {1..10}; do ./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.indices.recovery.IndexRecoveryIT.testDoNotInfinitelyWaitForMapping" -Dtests.jvm.argline="-Des.concurrent_search=true" -Dtests.iters=10 || exit 1; done
To verify, I reverted 8652e5cb5799e7c11d891c0ec75a5137dec1203e and the test never failed once.
I then reapplied that commit and got a failure pretty quickly.
@original-brownbear, something in that commit has broken this test. I wonder if we are failing to replicate a document deletion due to this tombstone change?
I have confirmed that the order in which the fields are added to the document is key to triggering this failure. I added extra logging to a local Lucene build to see whether the field infos do indeed think they have point values; apparently, DWPT says they DO have point values, but you can see from the flushed files that no point files are actually being written.
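To make the field-ordering effect concrete, here is a minimal standalone sketch (not the actual test; FailingTokenStream and the field names are hypothetical stand-ins): a text field whose analysis chain throws prevents a later point field from ever being indexed, even though the first (schema) pass over the document already recorded it in the segment's FieldInfos. This mirrors what the logs above show, under the assumption that the analysis failure is non-aborting.

```java
import java.io.IOException;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class FieldOrderRepro {

    /** Hypothetical stand-in for the analysis failure seen in the test. */
    static final class FailingTokenStream extends TokenStream {
        @Override
        public boolean incrementToken() throws IOException {
            throw new IOException("simulated analysis failure");
        }
    }

    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            // The text field comes first; its analysis chain throws ...
            doc.add(new TextField("test_field", new FailingTokenStream()));
            // ... so this point field is never actually indexed, even though the
            // schema pass over the document has already registered it in FieldInfos.
            doc.add(new LongPoint("_seq_no", 0L));
            try {
                writer.addDocument(doc);
            } catch (IOException expected) {
                // Non-aborting failure: the doc is marked deleted, the writer survives.
            }
            // If the logs above are representative, flushing now yields a segment with
            // 1 deleted doc whose FieldInfos claim points, but no .kdd/.kdi/.kdm files.
            writer.flush();
        }
    }
}
```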
I have noticed on every failure that occurs:
- DWPT indicates that the "new segment" has 1 deleted document
- It indicates that there are point values, but the files flushed on the rollback clearly do NOT include points (no .kdi file)
- During rollback, it seems to indicate there is one deleted doc
- During rollback (unless I am reading this wrong), the line "rollback: infos=" indicates there are no rollback infos. Does this mean we don't roll back the infos?
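For context on that question, here is a minimal sketch of what IndexWriter.rollback restores, under the assumption that the "rollback: infos=" infoStream line prints the SegmentInfos after they have been reset to the last commit; an empty infos= would then simply mean the last commit point contained zero segments.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class RollbackSemantics {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig());
            writer.commit();                    // empty commit: a segments_N file with zero segments
            writer.addDocument(new Document()); // uncommitted change
            writer.rollback();                  // closes the writer and drops everything back to
                                                // the last commit, i.e. zero segments here
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                System.out.println(reader.numDocs()); // prints 0
            }
        }
    }
}
```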
The following seems to be the cause:
- During document write, we hit this line in Lucene: https://github.com/apache/lucene/blob/0aa88910ca9a1032d288996d14203eac4953f2de/lucene/core/src/java/org/apache/lucene/index/IndexingChain.java#L592-L603
- This sets all the field info
- Note that, because the segNo field comes AFTER the text field, we later fail in the text field's analysis chain and never index the segNo field
- Here is the actual doc being indexed when we hit the failure:
1> [2024-05-07T05:53:02,318][INFO ][o.e.i.e.Engine ] [node_t1] [test][0] index into Lucene updated doc fields [[stored,indexed,omitNorms,indexOptions=DOCS<_id:[ff 75 30]>, stored<_source:[7b 22 74 65 73 74 5f 66 69 65 6c 64 22 3a 22 30 22 7d]>, docValuesType=NUMERIC<_version:1>, indexed,tokenized<test_field:0>, SingleValueLongField <_seq_no:0>, docValuesType=NUMERIC<_primary_term:1>]]
- The failure occurs, and we initiate a rollback
- This rollback occurs during an NRT read or replication
- During the rollback, it SEEMS like we do not actually roll back the field infos (a way to check this directly is sketched after this list)
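If that reading is right, the mismatch should be directly observable on the flushed shard data. Here is a hedged inspection sketch (the path argument, and the assumption that the suspect segment is still openable, are mine): it lists which fields claim point dimensions in FieldInfos and which files are actually present, so the two can be cross-checked.

```java
import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class InspectPointFieldInfos {
    public static void main(String[] args) throws Exception {
        // args[0]: the shard's index directory, e.g. .../indices/<uuid>/0/index
        try (Directory dir = FSDirectory.open(Paths.get(args[0]));
             DirectoryReader reader = DirectoryReader.open(dir)) {
            for (LeafReaderContext leaf : reader.leaves()) {
                for (FieldInfo fi : leaf.reader().getFieldInfos()) {
                    if (fi.getPointDimensionCount() > 0) {
                        System.out.println(fi.name + " claims " + fi.getPointDimensionCount() + " point dimension(s)");
                    }
                }
            }
            // If any field above claims points, the directory should also contain
            // point files (.kdd/.kdi/.kdm); in the failing runs it does not.
            for (String file : dir.listAll()) {
                System.out.println(file);
            }
        }
    }
}
```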
Attached is yet another run where it failed. This time it's just a single failure, for easier reading: 1-copy-output.log.zip
This will be fixed by: https://github.com/apache/lucene/pull/13369
Once we upgrade Lucene, I will unmute this test.