
Custom English word segmentation doesn't work, and I can't figure out how to configure it

Open chunpat opened this issue 2 years ago • 3 comments

Environment: Linux, Elasticsearch 7.17.7

Custom dictionary

vim elasticsearch-7.17.7/config/analysis-ik/main1.dic

OPPO
VIVO
中国
阿
西
吧

analyze

curl -XGET "http://localhost:9200/model/_analyze" -H 'Content-Type: application/json;charset=utf-8' -d' { "text": "中国阿西吧OPPOVIVO", "analyzer": "ik_smart" }'

{
	"tokens": [{
		"token": "中国",
		"start_offset": 0,
		"end_offset": 2,
		"type": "CN_WORD",
		"position": 0
	}, {
		"token": "阿",
		"start_offset": 2,
		"end_offset": 3,
		"type": "CN_WORD",
		"position": 1
	}, {
		"token": "西",
		"start_offset": 3,
		"end_offset": 4,
		"type": "CN_WORD",
		"position": 2
	}, {
		"token": "吧",
		"start_offset": 4,
		"end_offset": 5,
		"type": "CN_WORD",
		"position": 3
	}, {
		"token": "oppovivo",
		"start_offset": 5,
		"end_offset": 13,
		"type": "ENGLISH",
		"position": 4
	}]
}

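The English dictionary entries have no effect: "OPPOVIVO" comes back as the single token "oppovivo" instead of "OPPO" and "VIVO". I don't know how to fix this; any help would be appreciated.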
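To make the symptom concrete, here is a minimal sketch that compares the custom dictionary entries with the tokens in the response above. The token list is copied from that response with only the `token` and `type` fields kept; the helper name `missing_entries` is hypothetical and is not part of Elasticsearch or the IK plugin.

```python
import json

# Entries of main1.dic as listed in the post.
DICT_ENTRIES = ["OPPO", "VIVO", "中国", "阿", "西", "吧"]

# Tokens copied from the _analyze response in the post
# (offsets and positions omitted for brevity).
RESPONSE = json.loads("""
{"tokens": [
  {"token": "中国", "type": "CN_WORD"},
  {"token": "阿", "type": "CN_WORD"},
  {"token": "西", "type": "CN_WORD"},
  {"token": "吧", "type": "CN_WORD"},
  {"token": "oppovivo", "type": "ENGLISH"}
]}
""")


def missing_entries(entries, response):
    """Return dictionary entries that never appear as a token.

    The comparison is case-insensitive because the analyzer output
    in the post is lowercased ("oppovivo").
    """
    tokens = {t["token"].lower() for t in response["tokens"]}
    return [e for e in entries if e.lower() not in tokens]


print(missing_entries(DICT_ENTRIES, RESPONSE))  # ['OPPO', 'VIVO']
```

Only the two English entries are missing, while every Chinese entry is emitted as its own `CN_WORD` token.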

chunpat avatar May 18 '23 07:05 chunpat

Same question here: how should this kind of case be handled?

levylll avatar Mar 04 '24 08:03 levylll

OP, did you ever solve this?

AriesYB avatar May 01 '24 03:05 AriesYB

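I ended up handling it with a different plugin. It only supports segmenting full-width characters, though, so I converted all the English text to full-width: https://github.com/KennFalcon/elasticsearch-analysis-hanlp @AriesYB @levylll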
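The comment does not say how the conversion to full-width was done. One possible way, shown here only as a sketch, is to shift printable ASCII characters into the Unicode Halfwidth and Fullwidth Forms block. The function name `to_fullwidth` is hypothetical, and whether the HanLP plugin then splits the text as desired depends on its own dictionary and is not verified here.

```python
def to_fullwidth(text: str) -> str:
    """Convert half-width ASCII characters to their full-width forms.

    Printable ASCII U+0021..U+007E maps to U+FF01..U+FF5E (offset 0xFEE0),
    and the ASCII space maps to the ideographic space U+3000.
    All other characters, including Chinese text, are left unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x20:
            out.append("\u3000")
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


print(to_fullwidth("中国阿西吧OPPOVIVO"))  # 中国阿西吧OPPOVIVO
```

With this approach the same conversion would presumably have to be applied to the dictionary entries and to query strings as well, so that indexed text and search terms stay consistent.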

chunpat avatar Jul 24 '24 10:07 chunpat