
Serious bug: when the text being tokenized contains a standalone letter A, the tokenizer drops that A

Open: Dustone-JavaWeb opened this issue 1 year ago • 1 comment

GET /_analyze
{
  "analyzer": "ik_smart",
  "text": "我们A A制"
}

{
  "tokens": [
    {
      "token": "我们",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "制",
      "start_offset": 5,
      "end_offset": 6,
      "type": "CN_CHAR",
      "position": 1
    }
  ]
}

Dustone-JavaWeb, Apr 11 '23 09:04

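By default, ik loads a stopword dictionary, stopword.dic, which contains the letter 'a' (treated as a stop word in English), so that token is filtered out. Emptying /config/stopword.dic under the ik directory fixes it.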

wangming31, Jul 04 '23 03:07
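
If emptying the whole file is more than you want, removing only the offending entry may be enough. The following is a minimal Python sketch, not part of the plugin: `filter_tokens` is a simplified model of stopword filtering (it assumes tokens are lowercased before lookup, which would explain why an uppercase A matches the entry a), and `drop_stopwords` is a hypothetical helper that assumes stopword.dic is a UTF-8 file with one word per line. The token list and file path are illustrative.

```python
from pathlib import Path


def filter_tokens(tokens, stopwords):
    """Illustrative model only: drop tokens whose lowercase form is a stopword."""
    return [t for t in tokens if t.lower() not in stopwords]


def drop_stopwords(dic_path, to_remove):
    """Rewrite a one-word-per-line stopword file without the given entries.

    Returns the list of entries that were kept.
    """
    path = Path(dic_path)
    lines = path.read_text(encoding="utf-8").splitlines()
    kept = [line for line in lines if line.strip() not in to_remove]
    # Preserve a trailing newline only when something is left to write.
    path.write_text("\n".join(kept) + ("\n" if kept else ""), encoding="utf-8")
    return kept


if __name__ == "__main__":
    # With 'a' in the stopword set, the standalone A tokens disappear.
    print(filter_tokens(["我们", "A", "A", "制"], {"a", "the"}))
    # Hypothetical path; adjust to wherever the ik config lives on your node:
    # drop_stopwords("config/stopword.dic", {"a"})
```

Back up the file before editing it. The dictionary appears to be read when the plugin loads, so a node restart is probably needed before the change takes effect.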