elasticsearch-analysis-mmseg
Will a version of Mmseg that supports ES 6.X be developed?
Our production ES has always used Mmseg for tokenization. Because we need to upgrade, we modified the plugin source ourselves to make it compatible with ES 6.X. However, when we run a reindex, we hit an offset problem. The error is shown below; the cause looks similar to this issue (https://github.com/medcl/elasticsearch-analysis-pinyin/issues/143):

java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=0,endOffset=2,lastStartOffset=6 for field 'list_number'
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:767) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:392) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:240) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
    ……
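
The exception message names three conditions on the token offsets of a field: startOffset is non-negative, endOffset is at least startOffset, and startOffset never falls below the previous token's startOffset. The sketch below is a stand-alone illustration of those conditions, written from the message text alone; it is not Lucene's code and does not use the Lucene API. The class `OffsetCheck` and the method `firstBadToken` are hypothetical names, and the offset pairs are made-up values chosen to reproduce the reported `startOffset=0,endOffset=2,lastStartOffset=6` case.

```java
public class OffsetCheck {

    /**
     * Hypothetical helper: scans {startOffset, endOffset} pairs in the order
     * a tokenizer emitted them and returns the index of the first token that
     * breaks one of the conditions named in the exception message, or -1 if
     * every token passes.
     */
    static int firstBadToken(int[][] offsets) {
        int lastStartOffset = 0; // assumption: tracking starts at 0 for each field
        for (int i = 0; i < offsets.length; i++) {
            int start = offsets[i][0];
            int end = offsets[i][1];
            boolean negative = start < 0;
            boolean inverted = end < start;
            boolean backwards = start < lastStartOffset;
            if (negative || inverted || backwards) {
                return i;
            }
            lastStartOffset = start;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Offsets only move forward: passes.
        int[][] ok = {{0, 2}, {2, 4}, {6, 8}};
        // The third token jumps back to 0 after a token that started at 6,
        // matching startOffset=0,endOffset=2,lastStartOffset=6 in the report.
        int[][] bad = {{0, 2}, {6, 8}, {0, 2}};

        System.out.println(firstBadToken(ok));  // prints -1
        System.out.println(firstBadToken(bad)); // prints 2
    }
}
```

Running a check like this over the offsets that the modified tokenizer emits for a failing `list_number` value may help locate the token whose offset jumps backwards.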