OpenSearch icon indicating copy to clipboard operation
OpenSearch copied to clipboard

[BUG] Data corruption on an actively written primary shard can lead to failing all replicas.

Open itiyamas opened this issue 4 years ago • 14 comments

Describe the bug When data corruption is detected on primary, the engine fails. This in turn results in failing primary shard and closing it. Closing the shard also causes closing and cancelling any ongoing replication operations on primary. One such operation is shard bulk action on replica. Cancelling this operation in turn checks the reason for failure and because the reason is shard unavailability, the primary marks the replica as stale.

Both failing the primary and failing the replica are sent as shard failure actions on master, which in turn serializes them locally only after they are queued on state update task queue.

As a result, a corruption on primary shard can result in corruption of replica shards in certain cases, especially on high throughput writes.

To Reproduce Steps to reproduce the behavior: Please check this commit for the test. You may need to run it 2-3 times to reproduce the behavior.

Expected behavior Corruption on primary should not result in replica being lost and cluster turning red. Instead, replica should have taken over as primary and then a new replica should have been assigned- turning the cluster green after some time.

itiyamas avatar May 30 '21 07:05 itiyamas

Relevant output of the test:

/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home/bin/java -ea -Djava.locale.providers=SPI,COMPAT -Didea.test.cyclic.buffer.size=1048576 -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin -Dfile.encoding=UTF-8 -classpath /Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit5-rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit-rt.jar:/Users/itiyamas/OpenSearch/server/out/test/classes:/Users/itiyamas/OpenSearch/server/out/test/resources:/Users/itiyamas/OpenSearch/server/out/production/classes:/Users/itiyamas/OpenSearch/server/out/production/resources:/Users/itiyamas/OpenSearch/libs/x-content/out/production/classes:/Users/itiyamas/OpenSearch/test/framework/out/production/classes:/Users/itiyamas/OpenSearch/test/framework/out/production/resources:/Users/itiyamas/OpenSearch/libs/cli/out/production/classes:/Users/itiyamas/OpenSearch/libs/core/out/production/classes:/Users/itiyamas/OpenSearch/libs/secure-sm/out/production/classes:/Users/itiyamas/OpenSearch/libs/geo/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.8.2/e039fa90ee900c475bbc0943989ed5dccc556d3e/lucene-core-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.8.2/fa4b3c1d847f4bb91a3afaa990889dd164883d86/lucene-analyzers-common-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-backward-codecs/8.8.2/f4a0475a1404d7be87e911f5c85f8b913ec9770b/lucene-backward-codecs-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-grouping/8.8.2/207ac15287656b6ce61ea921fefec3070a8dd99d/lucene-grouping-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-highlighter/8.8.2/2674644580108335c08a25fd2e0f0c7dee010700/lucene-highlighter-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-join/8.8.2/6eb956741169fe8a5643fa39b14aef2fc067d865/lucene-join-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-memory/8.8.2/63b6e8e862c898d263b941ffa55fe6c85bcdf2df/lucene-memory-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-misc/8.8.2/a3a916c27c42ff6773c52d0be4045d4e5c00f7f8/lucene-misc-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queries/8.8.2/1f324c3a6e1a7634bbada993b2c014d12d09cb83/lucene-queries-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.8.2/67c6580f8a5187fdb7f9eb2404bdecc4635b3328/lucene-queryparser-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.8.2/b674deb4957a64db5c8f4afb9127400f4e65c334/lucene-sandbox-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-spatial-extras/8.8.2/227672b3b88c877b2df27608850ca15d834ec35c/lucene-spatial-extras-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-spatial3d/8.8.2/eb77b665a3b3e9ca6b3bb82c3342b843d9729d0d/lucene-spatial3d-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-suggest/8.8.2/36fd5301e7143f0fcb0ba933f96148dc1e132b66/lucene-suggest-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.carrotsearch/hppc/0.8.1/ffc7ba8f289428b9508ab484b8001dea944ae603/hppc-0.8.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/joda-time/joda-time/2.10.4/8c10bb8815109067ce3c91a8e547b5a52e8a1c1a/joda-time-2.10.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.tdunning/t-digest/3.2/2ab94758b0276a8a26102adf8d528cf6d0567b9a/t-digest-3.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.hdrhistogram/HdrHistogram/2.1.9/e4631ce165eb400edecfa32e03d3f1be53dee754/HdrHistogram-2.1.9.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.locationtech.spatial4j/spatial4j/0.7/faa8ba85d503da4ab872d17ba8c00da0098ab2f2/spatial4j-0.7.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.locationtech.jts/jts-core/1.15.0/705981b7e25d05a76a3654e597dab6ba423eb79e/jts-core-1.15.0.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-api/2.11.1/268f0fe4df3eefe052b57c87ec48517d64fb2a10/log4j-api-2.11.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-core/2.11.1/592a48674c926b01a9a747c7831bcd82a9e6d6e4/log4j-core-2.11.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/jna/5.5.0/ade077cbb2618a18bfc6c335413b2b7163d97601/jna-5.5.0.jar:/Users/itiyamas/OpenSearch/libs/plugin-classloader/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.yaml/snakeyaml/1.26/a78a8747147d2c5807683e76ec2b633e95c14fe9/snakeyaml-1.26.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.core/jackson-core/2.11.4/593f7b18bab07a76767f181e2a2336135ce82cc4/jackson-core-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-smile/2.11.4/10c1faac0b0bd8545eff02599b48a149202de066/jackson-dataformat-smile-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.11.4/ba01014ab0228449be401975b1a7af2f3cdaf1d7/jackson-dataformat-yaml-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-cbor/2.11.4/67fa6a00bdc31029bf841ee97d993ef2bb530aa0/jackson-dataformat-cbor-2.11.4.jar:/Users/itiyamas/OpenSearch/libs/nio/out/production/classes:/Users/itiyamas/OpenSearch/client/sniffer/out/production/classes:/Users/itiyamas/OpenSearch/client/rest/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.carrotsearch.randomizedtesting/randomizedtesting-runner/2.7.1/e917f4983144c3b969eb7d3648338ecde5e3ba89/randomizedtesting-runner-2.7.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/junit/junit/4.12/2973d150c0dc1fefe998f834810d68f278ea58ec/junit-4.12.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.1/9420ba32c29217b54eebd26ff7f9234d31c3fbb2/hamcrest-2.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-test-framework/8.8.2/a9b31c79190327e6c87a2987deed506d4969d153/lucene-test-framework-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-codecs/8.8.2/48b8faec209371df05fd47e319b4b1e382484347/lucene-codecs-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/commons-logging/commons-logging/1.1.3/f6f66e966c70a83ffbdb6f17a0919eaf7c8aca7f/commons-logging-1.1.3.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/commons-codec/commons-codec/1.13/3f18e1aa31031d89db6f01ba05d501258ce69d2c/commons-codec-1.13.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/securemock/1.2/98201d4ad5ac93f6b415ae9172d52b5e7cda490e/securemock-1.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/mocksocket/1.2/190ad3f42fd6ab7367bae0c0e14d1dc4e4a3c361/mocksocket-1.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/net.sf.jopt-simple/jopt-simple/5.0.2/98cafc6081d5632b61be2c9e60650b64ddbc637c/jopt-simple-5.0.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpclient/4.5.13/e5f6cae5ca7ecaac1ec2827a9e2d65ae2869cada/httpclient-4.5.13.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpcore/4.4.12/21ebaf6d532bc350ba95bd81938fa5f0e511c132/httpcore-4.4.12.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpasyncclient/4.1.4/f3a3240681faae3fa46b573a4c7e50cec9db0d86/httpasyncclient-4.1.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpcore-nio/4.4.12/84cd29eca842f31db02987cfedea245af020198b/httpcore-nio-4.4.12.jar com.intellij.rt.junit.JUnitStarter -ideVersion5 -junit4 org.opensearch.index.store.CorruptedFileIT,testCorruptFileDuringIndexing
[2021-05-30T09:20:00,496][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] before test
[2021-05-30T09:20:00,512][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: setting up test
[2021-05-30T09:20:00,582][INFO ][o.o.t.InternalTestCluster] [testCorruptFileDuringIndexing] Setup InternalTestCluster [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster] with seed [D3C51DF549BF4775] using [0] dedicated masters, [2] (data) nodes and [1] coord only nodes (min_master_nodes are [auto-managed])
[2021-05-30T09:20:01,142][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:01,143][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:01,147][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:01,148][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:01,169][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:01,233][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:01,234][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:01,338][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t0], node ID [ZMqAFWPqSPanRHGlwB6rXw], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:04,486][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,085][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:05,331][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:05,334][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:05,334][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:05,334][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:05,335][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:05,355][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:05,355][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:05,367][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t1], node ID [OvzlIeulSuyDzstoCsMgsw], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:05,399][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,488][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:05,630][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:05,633][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:05,633][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:05,634][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:05,634][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:05,669][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:05,669][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:05,687][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t2], node ID [wecIupjPS9O5h3wX52oGkg], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles []
[2021-05-30T09:20:05,744][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,860][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:06,011][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:06,018][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] starting ...
[2021-05-30T09:20:06,022][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] starting ...
[2021-05-30T09:20:06,022][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] starting ...
[2021-05-30T09:20:06,173][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] publish_address {127.0.0.1:64137}, bound_addresses {[::1]:64134}, {127.0.0.1:64137}
[2021-05-30T09:20:06,173][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] publish_address {127.0.0.1:64135}, bound_addresses {[::1]:64133}, {127.0.0.1:64135}
[2021-05-30T09:20:06,174][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] publish_address {127.0.0.1:64136}, bound_addresses {[::1]:64132}, {127.0.0.1:64136}
[2021-05-30T09:20:07,402][INFO ][o.o.c.c.Coordinator      ] [node_t1] setting initial configuration to VotingConfiguration{OvzlIeulSuyDzstoCsMgsw,ZMqAFWPqSPanRHGlwB6rXw}
[2021-05-30T09:20:08,043][INFO ][o.o.c.s.MasterService    ] [node_t1] elected-as-master ([2] nodes joined)[{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr} elect leader, {node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 1, version: 1, delta: master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}
[2021-05-30T09:20:08,345][INFO ][o.o.c.c.CoordinationState] [node_t1] cluster UUID set to [rho477_IQ6iNVM_6MoI8Ng]
[2021-05-30T09:20:08,345][INFO ][o.o.c.c.CoordinationState] [node_t0] cluster UUID set to [rho477_IQ6iNVM_6MoI8Ng]
[2021-05-30T09:20:08,622][INFO ][o.o.c.s.ClusterApplierService] [node_t0] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}, term: 1, version: 1, reason: ApplyCommitRequest{term=1, version=1, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,636][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] started
[2021-05-30T09:20:08,642][INFO ][o.o.c.s.ClusterApplierService] [node_t1] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 1, reason: Publication{term=1, version=1}
[2021-05-30T09:20:08,658][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] started
[2021-05-30T09:20:08,680][INFO ][o.o.c.s.MasterService    ] [node_t1] node-join[{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136} join existing leader], term: 1, version: 2, delta: added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}
[2021-05-30T09:20:08,740][INFO ][o.o.c.s.ClusterApplierService] [node_t0] added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}, term: 1, version: 2, reason: ApplyCommitRequest{term=1, version=2, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,740][INFO ][o.o.c.s.ClusterApplierService] [node_t2] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr},{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 2, reason: ApplyCommitRequest{term=1, version=2, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,741][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] started
[2021-05-30T09:20:08,754][INFO ][o.o.c.s.ClusterApplierService] [node_t1] added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}, term: 1, version: 2, reason: Publication{term=1, version=2}
[2021-05-30T09:20:08,763][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:08,763][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:09,474][INFO ][o.o.g.GatewayService     ] [node_t1] recovered [0] indices into cluster_state
[2021-05-30T09:20:09,555][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:09,555][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:09,605][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:09,605][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:10,532][INFO ][o.o.c.m.MetadataIndexTemplateService] [node_t1] adding template [random_index_template] for index patterns [*]
[2021-05-30T09:20:10,813][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: all set up test
[2021-05-30T09:20:10,814][INFO ][o.o.t.InternalTestCluster] [testCorruptFileDuringIndexing] increasing cluster size from 2 to 3
[2021-05-30T09:20:10,819][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:10,819][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:10,820][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:10,820][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:10,841][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:10,841][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:10,861][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t3], node ID [YNZFPQcdS26B1MS-Npd20A], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:10,968][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:11,037][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:11,062][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:11,090][DEPRECATION][o.o.d.c.s.Settings       ] [node_t1] [discovery.zen.minimum_master_nodes] setting was deprecated in OpenSearch and will be removed in a future release! See the breaking changes documentation for the next major version.
[2021-05-30T09:20:11,091][DEPRECATION][o.o.d.c.s.Settings       ] [node_t1] [discovery.zen.minimum_master_nodes] setting was deprecated in OpenSearch and will be removed in a future release! See the breaking changes documentation for the next major version.
[2021-05-30T09:20:11,179][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] starting ...
[2021-05-30T09:20:11,186][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] publish_address {127.0.0.1:64181}, bound_addresses {[::1]:64180}, {127.0.0.1:64181}
[2021-05-30T09:20:11,835][INFO ][o.o.c.s.MasterService    ] [node_t1] node-join[{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr} join existing leader], term: 1, version: 8, delta: added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}
[2021-05-30T09:20:11,870][INFO ][o.o.c.s.ClusterApplierService] [node_t2] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:11,880][INFO ][o.o.c.s.ClusterApplierService] [node_t0] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:12,214][INFO ][o.o.c.s.ClusterApplierService] [node_t3] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr},{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136},{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:12,223][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] started
[2021-05-30T09:20:12,223][INFO ][o.o.c.s.ClusterApplierService] [node_t1] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: Publication{term=1, version=8}
[2021-05-30T09:20:12,230][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> cluster has [3] data nodes, corrupted primary will be overwritten
[2021-05-30T09:20:12,235][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:12,236][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:13,462][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:13,474][INFO ][o.o.c.m.MetadataCreateIndexService] [node_t1] [test] creating index, cause [api], templates [random_index_template], shards [1]/[1]
[2021-05-30T09:20:14,116][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:14,179][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:15,409][WARN ][o.o.i.IndexSettings      ] [node_t0] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:15,480][WARN ][o.o.i.IndexSettings      ] [node_t0] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:16,057][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test][0]]]).
[2021-05-30T09:20:16,465][WARN ][o.o.i.IndexSettings      ] [node_t3] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:16,494][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:17,062][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] Index [1] docs async: [true] bulk: [false]
[2021-05-30T09:20:17,235][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:17,246][INFO ][o.o.c.m.MetadataMappingService] [node_t1] [test/Wi1zu6UjRx6JefElfgQ16g] create_mapping [type]
[2021-05-30T09:20:26,218][INFO ][o.o.t.CorruptionUtils    ] [testCorruptFileDuringIndexing] Corrupting file --  flipping at position 153 from 69 to 6a file: segments_4
[2021-05-30T09:20:26,219][INFO ][o.o.t.CorruptionUtils    ] [testCorruptFileDuringIndexing] Checksum before: [2425222970] after: [3359768840] checksum value after corruption: 2425222970] file: segments_4 length: 314
[2021-05-30T09:20:26,220][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> [test][0], node[OvzlIeulSuyDzstoCsMgsw], [P], s[STARTED], a[id=mFsTIe3PQx2-7bc6GTEKRQ] corrupted
[2021-05-30T09:20:26,220][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] -->  creating repository
[2021-05-30T09:20:26,228][INFO ][o.o.r.RepositoriesService] [node_t1] put repository [test-repo]
[2021-05-30T09:20:26,484][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> snapshot
[2021-05-30T09:20:26,487][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:26,884][INFO ][o.o.t.BackgroundIndexer  ] [testCorruptFileDuringIndexing] --> creating 5 indexing threads (auto start: [true], numOfDocs: [200])
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-5]] **** starting indexing thread 4
[2021-05-30T09:20:26,888][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-2]] **** starting indexing thread 1
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-1]] **** starting indexing thread 0
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-4]] **** starting indexing thread 3
[2021-05-30T09:20:26,888][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-3]] **** starting indexing thread 2
[2021-05-30T09:20:26,919][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:26,932][INFO ][o.o.c.m.MetadataMappingService] [node_t1] [test/Wi1zu6UjRx6JefElfgQ16g] update_mapping [type]
[2021-05-30T09:20:27,240][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:35,226][INFO ][o.o.s.SnapshotsService   ] [node_t1] snapshot [test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] started
[2021-05-30T09:20:35,264][WARN ][o.o.i.e.Engine           ] [node_t1] [test][0] failed engine [lucene commit failed]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) [classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,277][WARN ][o.o.i.e.Engine           ] [node_t1] [test][0] tried to fail engine but engine is already failed. ignoring. [corrupt file (source: [flush])]
org.opensearch.index.engine.FlushFailedEngineException: Flush failed
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1882) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) [classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) [classes/:?]
	... 8 more
[2021-05-30T09:20:35,279][WARN ][o.o.s.SnapshotShardsService] [node_t1] [[test][0]][test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] failed to snapshot shard
org.opensearch.index.engine.FlushFailedEngineException: Flush failed
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1882) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	... 8 more
[2021-05-30T09:20:35,283][WARN ][o.o.i.c.IndicesClusterStateService] [node_t1] [test][0] marking and sending shard failed due to [shard failure, reason [lucene commit failed]]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,291][WARN ][o.o.c.r.a.AllocationService] [node_t1] failing shard [failed shard, shard [test][0], node[OvzlIeulSuyDzstoCsMgsw], [P], s[STARTED], a[id=mFsTIe3PQx2-7bc6GTEKRQ], message [shard failure, reason [lucene commit failed]], failure [CorruptIndexException[checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))]], markAsStale [true]]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,294][WARN ][o.o.c.r.a.AllocationService] [node_t1] failing shard [failed shard, shard [test][0], node[ZMqAFWPqSPanRHGlwB6rXw], [R], s[STARTED], a[id=6VnoJTbVSDKM7EAKbdlqNg], message [failed to perform indices:data/write/bulk[s] on replica [test][0], node[ZMqAFWPqSPanRHGlwB6rXw], [R], s[STARTED], a[id=6VnoJTbVSDKM7EAKbdlqNg]], failure [IndexShardClosedException[CurrentState[CLOSED] Primary closed.]], markAsStale [true]]
org.opensearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] Primary closed.
	at org.opensearch.action.support.replication.PendingReplicationActions.lambda$cancelActions$0(PendingReplicationActions.java:125) ~[classes/:?]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
	at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3573) ~[?:?]
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658) ~[?:?]
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1624) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
	at org.opensearch.action.support.replication.PendingReplicationActions.lambda$cancelActions$1(PendingReplicationActions.java:125) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,302][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[test][0], [test][0]]]).
[2021-05-30T09:20:36,379][INFO ][o.o.s.SnapshotsService   ] [node_t1] snapshot [test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] completed with state [PARTIAL]
[2021-05-30T09:20:36,382][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> snapshot terminated with state PARTIAL
[2021-05-30T09:21:31,347][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-3]] **** failed indexing thread 2 on doc id 26
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][jC0lvHkBUZSfq7VjSxlo], source[{"test":"value26","text":" 푌랞럽푁꼭귣잲킾뾵 \uD83C\uDC25\uD83C\uDC1E\uD83C\uDC2B\uD83C\uDC09\uD83C\uDC1F\uD83C\uDC2B\uD83C\uDC1D\uD83C\uDC15\uD83C\uDC1A\uD83C\uDC04 ՅԻվխՂր՗ՈԼ ퟟퟴퟐ퟽ퟛ퟽ힾퟹힽퟗ ᧨᧻ \uD83C\uDD5C\uD83C\uDDE9\uD83C\uDD5D ąŮ \uD802\uDF14","id":26}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,373][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-4]] **** failed indexing thread 3 on doc id 34
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][lC0lvHkBUZSfq7VjUxlH], source[{"test":"value34","text":" \uD834\uDE0D\uD834\uDE0E ⟰⟿ \uD800\uDFC1\uD800\uDFDC\uD800\uDFD6\uD800\uDFBC\uD800\uDFC8\uD800\uDFA1\uD800\uDFB9 ➔❺❤❫➢➓✝➏➳ \uD804\uDCAE\uD804\uDCBF\uD804\uDCCE\uD804\uDC98 ⅊ℱℐ ꩾ \uDB94\uDF93\uDB8B\uDE72\uDB90\uDDB7\uDB9F\uDE9B\uDBAB\uDE1D\uDB81\uDED4\uDB9B\uDFBA\uDB93\uDF67 ⷏ⷊⶑⶂⷎⷃ ֐׹א֯מֿתת׀י \uD800\uDF2B ︬︨︨︫︬︤︦ \uD802\uDE7F\uD802\uDE7E\uD802\uDE68\uD802\uDE7D\uD802\uDE60\uD802\uDE6B\uD802\uDE6D\uD802\uDE6E\uD802\uDE68 ⷪⷸⷶⷯⷡⷭⷮⷻⷦⷨ Ⴥჿ჻ქხႤ ᎹᎠ ⭬ Ἡ \uD80C\uDD24","id":34}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,374][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-1]] **** failed indexing thread 0 on doc id 27
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][jS0lvHkBUZSfq7VjUxkl], source[{"test":"value27","text":" ⥏⥱⤼⤾⥩⥻⤔⥹⤴⤛ ᥰᥲ᥵᥾᥷᥮ᥩ \uD801\uDC1B\uD801\uDC02\uD801\uDC45\uD801\uDC12\uD801\uDC2F\uD801\uDC23 ꨔꨆ ꠦꠤꠏꠡꠐ \uD83C\uDEF9\uD83C\uDEC7\uD83C\uDE14 ۾۶ڝ۩ٱ۞ ⁶ₓⁱ₃₃ₗ ᮿᮔᮦᮢᮍᮦᮞᮔ ɵɭ ⸻⹦⹜⸑⸒⸥⸖⸜⸀ ࿫࿯༵࿌ཊ࿮࿾࿇࿺ཞ ᜉᜆ᜖᜛᜙᜚ᜅ ￸ ꣼꣢꣮ꣲ꣩꣫ꣽ꣦ꣴꣾ ǾƣDŽƹ ᭪ᬋ","id":27}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,378][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-2]] **** failed indexing thread 1 on doc id 28
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][ji0lvHkBUZSfq7VjUxkt], source[{"test":"value28","text":" 牡頕苒鰽滜舢 ㆙㆖㆘㆞ ও২া৳৆য়৹ ʕɽ \uDB40\uDD40\uDB40\uDD2F\uDB40\uDD97\uDB40\uDDC4\uDB40\uDD7C\uDB40\uDD0C\uDB40\uDD93\uDB40\uDD79 ᵴ","id":28}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:36,389][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> Checking RED status
[2021-05-30T09:21:36,391][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> Confirmed RED status
[2021-05-30T09:21:36,392][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: cleaning up after test

itiyamas avatar May 30 '21 07:05 itiyamas

You may need to run it 2-3 times to reproduce the behavior.

Is there not a reproducible -Dtests.seed?

nknize avatar Jun 01 '21 17:06 nknize

The issue is reproduced only when a pending replica write is in progress when the corruption happens. The way I achieved it is a bit hacky, I added a few sleeps and parallelized writes to increase the probability of this happening. This is why sometimes you may need to run it multiple times to run into the issue.

itiyamas avatar Jun 02 '21 11:06 itiyamas

Without looking at the code, maybe corrupted data can be cooked ahead of time, or stored data forcefully corrupted by writing some random bytes for the test to be reliable?

dblock avatar Jun 04 '21 14:06 dblock

That's what I did- I have forcefully corrupted the data. The issue is to reproduce the race where the writes are pending on the shard at exactly the same time when the primary fails - again deterministically reproducing the race is not impossible, just that adding sleep was just faster. :)

itiyamas avatar Jun 04 '21 15:06 itiyamas

@itiyamas I was unable to reproduce the issue in a deterministic manner. Do you have any specific idea on how to do so?

adnapibar avatar Sep 13 '21 18:09 adnapibar

Hi @itiyamas, I am closing the issue for now, since there is no clear steps to reproduce it. Feel free to re-open if you have more details to add.

anasalkouz avatar Nov 17 '21 21:11 anasalkouz

@anasalkouz Not sure if this got fixed as a side effect of some other change. I was able to trace through the code and identify what was causing the bug. Are those code paths not relevant anymore? Did you check that? The ones that are shared in the description.

itiyamas avatar Jan 13 '22 20:01 itiyamas

@anasalkouz Please re-open this issue. This is an ongoing issue in one of our clusters. If you know of a fix, I would like to backport it to older versions as well. Also, if you are unable to repro this, I would like to pursue this myself.

itiyama avatar Jul 30 '22 00:07 itiyama

I'm looking into this now

andrross avatar Aug 01 '22 15:08 andrross

I have cherry picked the modified integration test into the latest on the main branch and have verified that this bug is fairly reliably reproducible. It usually takes me between 3 and 6 attempts to see a failure, never taking more than 10 minutes of running to fail. I'm going to keep digging to see if I can come up with any ideas about how to fix this.

andrross avatar Aug 02 '22 00:08 andrross

One of the ways that I can think of solving this:

  1. Primary should never fail the replica when primary itself is cancelling replication operations for its own failure. Currently it has no way to distinguish errors originating from replica vs locally originating from primary itself.
  2. Code wise: When primary cancels the replication action, there should be a way to distinguish it from the errors originating from replica(maybe add source in the exception message- primary/replica and corresponding term). Once we have a way to distinguish the failures, it can be handled gracefully in onFailure or in failShardIfNeeded by checking the source of this failure.

What do you think?

itiyama avatar Aug 03 '22 21:08 itiyama

Thanks @itiyama! Yes, the approach you've describe is exactly what I'm looking into.

andrross avatar Aug 03 '22 23:08 andrross

@itiyama Please take a look at my draft PR. I still need to make the test deterministic, and some of the various sleeps are still in place. The approach I took to create a specific exception type for primary closure and then branch the logic based on that doesn't feel like the most general-purpose way to solve this, but it does appear to fix the issue. Please let me know what you think. I am worried about unintended side effects here (the worst case being that a replica shard that should have been failed is not failed and then promoted to primary and data is lost, but going with a very targeted fix would seem to make that unlikely).

andrross avatar Aug 04 '22 20:08 andrross