infinity

Published 20 hours ago •

[Feature Request]: Improve Chinese analyzer

Open yingfeng opened this issue 8 months ago • 0 comments

Is there an existing issue for the same feature request?

[X] I have checked the existing issues.

Describe the feature you'd like

The current Jieba analyzer for Chinese has several problems:

Stopwords are supported through external dictionaries, so the final output does not have continuous offsets, which affects phrase queries.
For English tokens, no stemmer is applied.
Query segmentation uses a finer granularity but has no smart policy for choosing it, which affects ranking for Chinese text.
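
To illustrate the first problem, here is a minimal sketch of how removing stopwords can leave gaps in token positions and break a phrase matcher that expects adjacent positions. Everything in it is an assumption made for illustration: the stopword set, the pre-segmented tokens, and the helper names `positions_preserving_gaps`, `positions_renumbered` and `phrase_match` are hypothetical and are not Infinity's or Jieba's actual API.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stopword set; the real analyzer loads these from external dictionaries.
STOPWORDS: Set[str] = {"的"}


def positions_preserving_gaps(tokens: List[str]) -> List[Tuple[int, str]]:
    """Drop stopwords but keep each surviving token's original position."""
    return [(i, t) for i, t in enumerate(tokens) if t not in STOPWORDS]


def positions_renumbered(tokens: List[str]) -> List[Tuple[int, str]]:
    """Drop stopwords, then number the surviving tokens consecutively."""
    kept = [t for t in tokens if t not in STOPWORDS]
    return list(enumerate(kept))


def phrase_match(postings: List[Tuple[int, str]], phrase: List[str]) -> bool:
    """Return True if the phrase tokens occur at strictly consecutive positions."""
    if not phrase:
        return False
    by_token: Dict[str, Set[int]] = {}
    for pos, tok in postings:
        by_token.setdefault(tok, set()).add(pos)
    return any(
        all(start + k in by_token.get(tok, set()) for k, tok in enumerate(phrase))
        for start in by_token.get(phrase[0], set())
    )


# Assumed segmentation of "我的书"; a real run would come from the analyzer.
doc_tokens = ["我", "的", "书"]
query = ["我", "书"]

gapped = positions_preserving_gaps(doc_tokens)  # positions 0 and 2
dense = positions_renumbered(doc_tokens)        # positions 0 and 1

print(phrase_match(gapped, query))
print(phrase_match(dense, query))
```

With gapped positions the strict-adjacency matcher misses the phrase, while consecutive numbering lets it match. Which numbering is right depends on how the phrase query itself is analyzed, so this only shows why the two sides need a consistent policy.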

Jun 08 '24 14:06 yingfeng