csyjgu
@ArvinZhuang 1. Yes. 2. Yes. It has little influence on the title, because a title usually contains no `','`. But it does influence the content, since the content is long and has...
@wshuai190 You can try recall_v3.zip. If the problem is still there, send me the code (or the line numbers) that causes the problem.
"江苏省" 用ik_smart的分词结果只有一个词”江苏省“。所以用”江“或”江苏“都匹配不上。 这种情况需要把词库中的“江苏省”甚至“江苏”都删除。 如果想要的分词效果是拆成单个字,可以用standard分词。但是这样分词的结果就没有词语了,全是单字。 如果分词结果既想要单字,又想要词语,目前的分词好像达不到。