elasticsearch-analysis-ansj
Overwriting a document fails when a field is indexed but not stored
[2017-08-14 20:41:02,052][WARN ][cluster.action.shard ] [bd34] [nnewsindex][2] received shard failed for target shard [[nnewsindex][2], node[XrPm_SElSz2V262ysBmmAA], relocating [I0inuDUeR_OXHL4auxvQKQ], [R], v[10718], s[INITIALIZING], a[id=z8ta7k7HSP2nECAdZ2QClg, rId=tDL5rPFzS5KtSSztdBWW5w], expected_shard_size[410490]], indexUUID [VeqS8xejSAyfHVOUibm6Gw], message [failed to update mappings], failure [MapperParsingException[analyzer [index_ansj] not found for field [content]]]
MapperParsingException[analyzer [index_ansj] not found for field [content]]
at org.elasticsearch.index.mapper.core.TypeParsers.parseAnalyzersAndTermVectors(TypeParsers.java:213)
at org.elasticsearch.index.mapper.core.TypeParsers.parseTextField(TypeParsers.java:250)
at org.elasticsearch.index.mapper.core.StringFieldMapper$TypeParser.parse(StringFieldMapper.java:161)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:305)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:218)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:118)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:99)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:498)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:288)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.processMapping(IndicesClusterStateService.java:387)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyMappings(IndicesClusterStateService.java:348)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:164)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
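When I first indexed the documents I did not check whether they had already been indexed. The content field is indexed but not stored (it is excluded), so indexing the same document again produces this error. Could this be a bug?
Which version has the problem? Is it a cluster or a single node? What does the configuration file look like?
@shi-yuan I compiled ansj 5.1.1 myself and use it with ES 2.3.1. After checking carefully, the problem is not what I described above: ES simply could not find index_ansj. index_ansj is not specified in elasticsearch.yml. I added two custom analyzers in the index settings, but the mapping uses both those two and the ansj analyzers. I wonder whether the ansj analyzers cannot be found because they are not declared in the settings.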
setting:
" \"index\": {" +
" \"analysis\": {" +
" \"analyzer\": {" +
" \"word_analyzer\": {" +
" \"type\": \"custom\"," +
" \"tokenizer\": \"word_tokenizer\"" +
" }," +
" \"id_analyzer\": {" +
" \"type\": \"custom\"," +
" \"tokenizer\": \"id_tokenizer\"" +
" }," +
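// At first I assumed the index and query analyzers were already registered with ES and did not need to be declared again,
// so I left index_ansj and query_ansj out of the settings, and the analyzers could not be found.
" \"index_ansj\": {" +
" \"type\": \"index_ansj\"" +
" }," +
" \"query_ansj\": {" +
" \"type\": \"query_ansj\"" +
" }" +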
" }," +
" \"tokenizer\": {" +
" \"word_tokenizer\": {" +
" \"pattern\": \"\\\\s|,|,\"," +
" \"type\": \"pattern\"" +
" }," +
" \"id_tokenizer\": {" +
" \"pattern\": \"-\"," +
" \"type\": \"pattern\"" +
" }" +
" }" +
" }" +
" }" +
"}"
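For readers unfamiliar with the two pattern tokenizers declared in these settings: a pattern tokenizer splits the input on matches of a Java regular expression. The sketch below only approximates that behavior with `java.util.regex` and does not call Elasticsearch; the class name `PatternSplitSketch` and the sample inputs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PatternSplitSketch {
    // Split text on every match of the regex and drop empty pieces.
    // This is an approximation of a pattern tokenizer, not the ES implementation.
    static List<String> split(String regex, String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : Pattern.compile(regex).split(text)) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same idea as word_tokenizer: split on whitespace or a comma.
        System.out.println(split("\\s|,", "es ansj,plugin")); // prints [es, ansj, plugin]
        // Same idea as id_tokenizer: split on a hyphen.
        System.out.println(split("-", "2017-08-14"));         // prints [2017, 08, 14]
    }
}
```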
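My guess is that once I declare custom analyzers in the index settings, those settings take precedence, so any analyzer that is already registered with ES but not listed in my own settings can no longer be found.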
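Whatever the root cause turns out to be, a mismatch like this can be caught on the client before the mapping is sent. The following is a minimal sketch under stated assumptions: the class `AnalyzerPreflight` and the method `undeclared` are hypothetical, the built-in list is deliberately incomplete, and the check never contacts Elasticsearch, so it cannot tell whether a plugin has registered an analyzer globally.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class AnalyzerPreflight {
    // A few analyzers that ship with Elasticsearch itself (not an exhaustive list).
    static final Set<String> BUILT_IN = new LinkedHashSet<>(
            Arrays.asList("standard", "simple", "whitespace", "stop", "keyword", "pattern"));

    // Return the analyzer names used in the mapping that are neither declared
    // in the index settings nor in the built-in list above.
    static Set<String> undeclared(Set<String> declaredInSettings, List<String> usedInMapping) {
        Set<String> missing = new LinkedHashSet<>();
        for (String name : usedInMapping) {
            if (!declaredInSettings.contains(name) && !BUILT_IN.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Settings as first written in this thread: only the two custom analyzers.
        Set<String> declared = new LinkedHashSet<>(Arrays.asList("word_analyzer", "id_analyzer"));
        // Analyzer names referenced by the content and newsid fields.
        List<String> used = Arrays.asList("index_ansj", "query_ansj", "id_analyzer");
        System.out.println(undeclared(declared, used)); // prints [index_ansj, query_ansj]
    }
}
```

Names reported here are only candidates to verify: a plugin may still provide them, in which case the check is too strict.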
.startObject("content").field("type", "string").field("index", "analyzed").field("analyzer", "index_ansj").field("search_analyzer","query_ansj").endObject()
.startObject("newsid").field("type", "string").field("index", "analyzed").field("analyzer", "id_analyzer").endObject()
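One more issue: your es_ansj plugin for ES 2.3.1 uses ansj_seg 3.x, which has an array-index-out-of-bounds error, so I compiled version 5.1.1 myself and use it with the 2.3.1 es_ansj plugin. However, ansj 5.1 does not seem to read StopLibrary: even with it configured in library.properties, the stop-word dictionary has no effect. Did I configure it wrong?
The configuration is probably wrong. How did you configure it?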