lucene
lucene copied to clipboard
UnsupportedOperation when merging `Lucene90BlockTreeTermsWriter`
Description
Found this in the wild. I haven't been able to replicate :(
I don't even know what it means to hit this fst.outputs.merge branch and under what conditions is it valid/invalid. Any pointers here would be useful.
We ran into a strange postings merge error in production.
The FST compiler reaches the "merge" line when merging some segments:
https://github.com/apache/lucene/blob/4b94d97a26aabebfb301f1bb34af0b4cdc284e79/lucene/core/src/java/org/apache/lucene/util/fst/FSTCompiler.java#L933-L936
However, the "outputs" provided by Lucene90BlockTreeTermsWriter is ByteSequenceOutputs, which does not override merge, and thus throws an unsupported operation exception.
https://github.com/apache/lucene/blob/4b94d97a26aabebfb301f1bb34af0b4cdc284e79/lucene/core/src/java/org/apache/lucene/codecs/lucene90/blocktree/Lucene90BlockTreeTermsWriter.java#L534-L548
Given this, it seems like it should be "impossible" to reach the "Outputs.merge" path when merging with the Lucene90BlockTreeTermsWriter, but somehow it did.
Any ideas on where I should look?
at org.apache.lucene.util.fst.Outputs.merge(Outputs.java:95) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.util.fst.FSTCompiler.add(FSTCompiler.java:936) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$PendingBlock.append(Lucene90BlockTreeTermsWriter.java:593) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$PendingBlock.compileIndex(Lucene90BlockTreeTermsWriter.java:562) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.writeBlocks(Lucene90BlockTreeTermsWriter.java:776) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.finish(Lucene90BlockTreeTermsWriter.java:1163) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter.write(Lucene90BlockTreeTermsWriter.java:402) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:95) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:204) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:211) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:300) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5293) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4761) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6582) ~[lucene-core-9.11.1.jar:?]
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:660) ~[lucene-core-9.11.1.jar:?]
at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:134) ~[elasticsearch-8.15.0.jar:?]
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:721) ~[lucene-core-9.11.1.jar:?]```
### Version and environment details
Lucene 9.11.1