
CorruptIndexException: docs out of order in merge thread

Open swapnilsvaidya opened this issue 8 months ago • 1 comments

Description

We are using OpenSearch 1.2.3 to index our data, and we frequently observe the following CorruptIndexException:


org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (46 <= 684 ) (resource=RateLimitedIndexOutput(FSIndexOutput(path="/search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/_qwcf_Lucene84_0.doc")))

    at org.opensearch.index.engine.InternalEngine$EngineMergeScheduler$2.doRun(InternalEngine.java:2719) [opensearch-1.2.3.jar:1.2.3]

    at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792) [opensearch-1.2.3.jar:1.2.3]

    at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) [opensearch-1.2.3.jar:1.2.3]

    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]

    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]

    at java.lang.Thread.run(Unknown Source) [?:?]

Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (46 <= 684 ) (resource=RateLimitedIndexOutput(FSIndexOutput(path="/search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/_qwcf_Lucene84_0.doc")))

    at org.apache.lucene.codecs.lucene84.Lucene84PostingsWriter.startDoc(Lucene84PostingsWriter.java:231) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:146) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:907) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]

    at org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118) ~[opensearch-1.2.3.jar:1.2.3]

    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684) ~[lucene-core-8.10.1.jar:8.10.1 2f24e6a49d48a032df1f12e146612f59141727a9 - mayyasharipova - 2021-10-12 15:13:05]
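For readers of the trace: as far as I understand it, `docs out of order (46 <= 684 )` means that, while writing the postings for one term, the writer was handed doc ID 46 after it had already written doc ID 684. Postings are expected to arrive in strictly increasing doc ID order. The following is only a minimal sketch of that invariant under that assumption; `DocOrderCheck` and `firstOutOfOrder` are hypothetical names for illustration, not Lucene code.

```java
import java.util.OptionalInt;

/** Illustrative only: models the "strictly increasing doc IDs" rule, not Lucene's implementation. */
final class DocOrderCheck {

    /**
     * Returns the position of the first doc ID that is less than or equal to
     * its predecessor, or an empty result if the sequence is strictly increasing.
     */
    static OptionalInt firstOutOfOrder(int[] docIds) {
        int lastDocId = -1; // doc IDs are non-negative, so 0 is a valid first ID
        for (int i = 0; i < docIds.length; i++) {
            if (docIds[i] <= lastDocId) {
                return OptionalInt.of(i);
            }
            lastDocId = docIds[i];
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Made-up sequence that reproduces the pair from the exception message: 46 after 684.
        int[] postings = {12, 300, 684, 46, 700};
        System.out.println(firstOutOfOrder(postings)); // prints OptionalInt[3]
    }
}
```

In the merge above, a violation of this rule is detected while the new segment is being written, which may be why the segments already on disk can still pass their own checks.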

I can see that a similar issue was filed against Lucene 2.x, but we are on a recent version (Lucene 8.10.1) and still face this issue.

I checked the state of the corrupted index with the CheckIndex utility, but it does not report any problem with the index:

java -cp lucene-core-8.10.1.jar org.apache.lucene.index.CheckIndex -slow /search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/

Opening index @ /search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/

0.00% total deletions; 12038 documents; 0 deleteions Segments file=segments_3m numSegments=11 version=8.10.1 id=8jru4jsnlmk6em810fnqva8cw userData={history_uuid=HJhQ41sOTDeVG_Jtb8yT1Q, local_checkpoint=1473044, max_seq_no=1473044, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1461070, translog_uuid=TDZJxve2RLCYj-DpRFRAbw}

1 of 11: name=_qp5q maxDoc=5361 version=8.10.1 id=8jru4jsnlmk6em810fnqv8ihn codec=Lucene87 compound=false numFiles=20 size (MB)=21.958 diagnostics = {os=Linux, java.version=11.0.23, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=merge, os.version=5.15.146-nn2-server, java.vendor=BellSoft, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, mergeMaxNumSegments=-1, mergeFactor=10, timestamp=1717442290779} no deletions test: open reader.........OK [took 0.079 sec] test: check integrity.....OK [took 0.014 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [328 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.016 sec] test: terms, freq, prox...OK [124918 terms; 2974090 terms/docs pairs; 3758504 tokens] [took 0.824 sec] test: stored fields.......OK [10722 total field count; avg 2.0 fields per doc] [took 0.159 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [246 docvalues fields; 0 BINARY; 4 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.407 sec] test: points..............OK [76 fields, 615710 points] [took 0.063 sec] test: check soft deletes.....
2 of 11: name=_qri6 maxDoc=3313 version=8.10.1 id=8jru4jsnlmk6em810fnqv9ceu codec=Lucene87 compound=false numFiles=20 size (MB)=14.201 diagnostics = {os=Linux, java.version=11.0.23, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=merge, os.version=5.15.146-nn2-server, java.vendor=BellSoft, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, mergeMaxNumSegments=-1, mergeFactor=10, timestamp=1717443246184} no deletions test: open reader.........OK [took 0.013 sec] test: check integrity.....OK [took 0.005 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [328 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.001 sec] test: terms, freq, prox...OK [104449 terms; 1838566 terms/docs pairs; 2323261 tokens] [took 0.363 sec] test: stored fields.......OK [6626 total field count; avg 2.0 fields per doc] [took 0.075 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [246 docvalues fields; 0 BINARY; 4 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.137 sec] test: points..............OK [76 fields, 380766 points] [took 0.025 sec] test: check soft deletes..... 
3 of 11: name=_qsvd maxDoc=1907 version=8.10.1 id=8jru4jsnlmk6em810fnqv9tyn codec=Lucene87 compound=false numFiles=20 size (MB)=8.642 diagnostics = {os=Linux, java.version=11.0.23, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=merge, os.version=5.15.146-nn2-server, java.vendor=BellSoft, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, mergeMaxNumSegments=-1, mergeFactor=10, timestamp=1717443857480} no deletions test: open reader.........OK [took 0.011 sec] test: check integrity.....OK [took 0.003 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [328 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [78363 terms; 1058585 terms/docs pairs; 1337263 tokens] [took 0.204 sec] test: stored fields.......OK [3814 total field count; avg 2.0 fields per doc] [took 0.045 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [246 docvalues fields; 0 BINARY; 4 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.064 sec] test: points..............OK [76 fields, 218938 points] [took 0.015 sec] test: check soft deletes..... 
4 of 11: name=_qtw1 maxDoc=1448 version=8.10.1 id=8jru4jsnlmk6em810fnqva88x codec=Lucene87 compound=false numFiles=20 size (MB)=6.901 diagnostics = {os=Linux, java.version=11.0.23, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=merge, os.version=5.15.146-nn2-server, java.vendor=BellSoft, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, mergeMaxNumSegments=-1, mergeFactor=10, timestamp=1717444263593} no deletions test: open reader.........OK [took 0.010 sec] test: check integrity.....OK [took 0.002 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [328 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [76024 terms; 804015 terms/docs pairs; 1015762 tokens] [took 0.161 sec] test: stored fields.......OK [2896 total field count; avg 2.0 fields per doc] [took 0.033 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [246 docvalues fields; 0 BINARY; 4 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.049 sec] test: points..............OK [76 fields, 166420 points] [took 0.023 sec] test: check soft deletes..... 
5 of 11: name=_qtw2 maxDoc=1 version=8.10.1 id=8jru4jsnlmk6em810fnqva895 codec=Lucene87 compound=true numFiles=3 size (MB)=0.139 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444263988} no deletions test: open reader.........OK [took 0.007 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [557 terms; 557 terms/docs pairs; 703 tokens] [took 0.007 sec] test: stored fields.......OK [2 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 116 points] [took 0.001 sec]

6 of 11: name=_qtw3 maxDoc=1 version=8.10.1 id=8jru4jsnlmk6em810fnqva89g codec=Lucene87 compound=true numFiles=3 size (MB)=0.139 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444264168} no deletions test: open reader.........OK [took 0.006 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [557 terms; 557 terms/docs pairs; 703 tokens] [took 0.005 sec] test: stored fields.......OK [2 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.003 sec] test: points..............OK [76 fields, 116 points] [took 0.001 sec]

7 of 11: name=_qtw4 maxDoc=1 version=8.10.1 id=8jru4jsnlmk6em810fnqva89p codec=Lucene87 compound=true numFiles=3 size (MB)=0.139 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444264260} no deletions test: open reader.........OK [took 0.007 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [552 terms; 552 terms/docs pairs; 701 tokens] [took 0.008 sec] test: stored fields.......OK [2 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 114 points] [took 0.001 sec]

8 of 11: name=_qtw5 maxDoc=2 version=8.10.1 id=8jru4jsnlmk6em810fnqva8ab codec=Lucene87 compound=true numFiles=3 size (MB)=0.15 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444264887} no deletions test: open reader.........OK [took 0.006 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [659 terms; 1114 terms/docs pairs; 1406 tokens] [took 0.008 sec] test: stored fields.......OK [4 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 232 points] [took 0.001 sec]

9 of 11: name=_qtw6 maxDoc=2 version=8.10.1 id=8jru4jsnlmk6em810fnqva8az codec=Lucene87 compound=true numFiles=3 size (MB)=0.152 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444265208} no deletions test: open reader.........OK [took 0.006 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [716 terms; 1114 terms/docs pairs; 1402 tokens] [took 0.007 sec] test: stored fields.......OK [4 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 228 points] [took 0.001 sec]

10 of 11: name=_qtw7 maxDoc=1 version=8.10.1 id=8jru4jsnlmk6em810fnqva8bm codec=Lucene87 compound=true numFiles=3 size (MB)=0.139 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444265291} no deletions test: open reader.........OK [took 0.006 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [555 terms; 555 terms/docs pairs; 701 tokens] [took 0.006 sec] test: stored fields.......OK [2 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 114 points] [took 0.001 sec]

11 of 11: name=_qtw8 maxDoc=1 version=8.10.1 id=8jru4jsnlmk6em810fnqva8cd codec=Lucene87 compound=true numFiles=3 size (MB)=0.139 diagnostics = {java.vendor=BellSoft, os=Linux, java.version=11.0.23, java.vm.version=11.0.23+10-LTS, lucene.version=8.10.1, os.arch=amd64, java.runtime.version=11.0.23+10-LTS, source=flush, os.version=5.15.146-nn2-server, timestamp=1717444265405} no deletions test: open reader.........OK [took 0.006 sec] test: check integrity.....OK [took 0.000 sec] test: check live docs.....OK [took 0.000 sec] test: field infos.........OK [327 fields] [took 0.000 sec] test: field norms.........OK [80 fields] [took 0.000 sec] test: terms, freq, prox...OK [554 terms; 554 terms/docs pairs; 703 tokens] [took 0.006 sec] test: stored fields.......OK [2 total field count; avg 2.0 fields per doc] [took 0.000 sec] test: term vectors........OK [0 total term vector count; avg 0.0 term/freq vector fields per doc] [took 0.000 sec] test: docvalues...........OK [245 docvalues fields; 0 BINARY; 3 NUMERIC; 0 SORTED; 84 SORTED_NUMERIC; 158 SORTED_SET] [took 0.002 sec] test: points..............OK [76 fields, 116 points] [took 0.001 sec]

No problems were detected with this index.

Took 3.109 sec total.

I also checked the specific segment (_qwcf) named in the exception, but CheckIndex reports no problem for this segment either:


java -cp lucene-core-8.10.1.jar org.apache.lucene.index.CheckIndex -segment _qwcf -slow /search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/

Opening index @ /search/nodes/0/indices/A7Yxq0gHSz2ktlIv695d2Q/0/index/

0.00% total deletions; 12038 documents; 0 deleteions Segments file=segments_3m numSegments=11 version=8.10.1 id=8jru4jsnlmk6em810fnqva8cw userData={history_uuid=HJhQ41sOTDeVG_Jtb8yT1Q, local_checkpoint=1473044, max_seq_no=1473044, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1461070, translog_uuid=TDZJxve2RLCYj-DpRFRAbw}

Checking only these segments: _qwcf: No problems were detected with this index.

Took 0.117 sec total.
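One thing worth double-checking in the two runs above: the full run lists 11 segments (_qp5q through _qtw8), and _qwcf is not among them, while the `-segment _qwcf` run prints no per-segment test lines at all. That may mean _qwcf is not part of the segments_3m commit (for example, a merge output that never completed), in which case the second run would have had nothing to check. This is an assumption drawn only from the output shown here. A minimal sketch for confirming it from saved CheckIndex output follows; `CheckIndexSegments` and `segmentNames` are hypothetical helper names, and the regex assumes the `N of M: name=...` header format seen above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper: extracts segment names from CheckIndex console output. */
final class CheckIndexSegments {

    // Matches per-segment headers such as "2 of 11: name=_qri6 maxDoc=3313".
    private static final Pattern SEGMENT_HEADER =
            Pattern.compile("\\d+ of \\d+: name=(\\S+)");

    /** Returns every segment name that has a per-segment header in the given output. */
    static List<String> segmentNames(String checkIndexOutput) {
        List<String> names = new ArrayList<>();
        Matcher m = SEGMENT_HEADER.matcher(checkIndexOutput);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Abbreviated sample taken from the first two segment headers of the full run.
        String sample = "1 of 11: name=_qp5q maxDoc=5361 version=8.10.1 "
                + "2 of 11: name=_qri6 maxDoc=3313 version=8.10.1";
        List<String> names = segmentNames(sample);
        System.out.println(names);                   // prints [_qp5q, _qri6]
        System.out.println(names.contains("_qwcf")); // prints false
    }
}
```

Running this over the complete output of the first command should show whether the segment named in the exception was ever actually inspected.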

Version and environment details

No response

swapnilsvaidya, Jun 06 '24 06:06