nodejieba

Is it possible to disable splitting on the English period "."?

Open colerape opened this issue 7 years ago • 3 comments

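I want to search directly for the keyword "www.baidu.com", but the content gets split into "." "www" "baidu" "com", so a direct search finds nothing. When the content is something like t.baidu.com; a.baidu.com; t.sohu.com; a.sohu.com, the dots are split off, and once there are a lot of domain names, almost nothing can be found.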

colerape avatar Jun 07 '17 08:06 colerape

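For English text, I'd suggest special-casing it first and only then running Chinese word segmentation; otherwise you will hit a lot of bad cases.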
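
One possible reading of this suggestion, as a rough sketch and not something posted in the thread: pull domain-like runs out with a regular expression first, keep each one as a single token, and pass only the remaining text to the segmenter. The helper name `cutKeepingDomains`, the `Segmenter` type, and the `DOMAIN_LIKE` pattern are all assumptions made for illustration; the pattern in particular is deliberately simple and would need tuning for real data.

```typescript
// Hypothetical helper, not part of nodejieba: keeps domain-like runs whole
// and hands everything else to a caller-supplied segmenter.
type Segmenter = (text: string) => string[];

// Assumed definition of "domain-like": two or more labels made of ASCII
// letters, digits, or hyphens, joined by dots (e.g. www.baidu.com).
const DOMAIN_LIKE = /[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/;

function cutKeepingDomains(text: string, cut: Segmenter): string[] {
  const tokens: string[] = [];
  // Fresh global regex per call so lastIndex state is never shared.
  const re = new RegExp(DOMAIN_LIKE.source, "g");
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    // Segment the plain text that precedes this domain-like run.
    if (m.index > last) {
      tokens.push(...cut(text.slice(last, m.index)));
    }
    // Emit the domain-like run as one unsplit token.
    tokens.push(m[0]);
    last = m.index + m[0].length;
  }
  // Segment whatever follows the final match.
  if (last < text.length) {
    tokens.push(...cut(text.slice(last)));
  }
  return tokens;
}
```

With nodejieba installed, the `cut` argument could be `nodejieba.cut`, for example `cutKeepingDomains(content, (s) => nodejieba.cut(s))`. The same pre-processing would have to be applied to the search query as well, so that the indexed tokens and the query tokens match.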

yanyiwu avatar Aug 05 '17 05:08 yanyiwu

Thanks, the problem was resolved later.

colerape avatar Aug 14 '17 05:08 colerape

This issue has not been updated for over 5 years and will be marked as stale. If the issue still exists, please comment or update the issue, otherwise it will be closed after 7 days.

github-actions[bot] avatar Sep 07 '24 13:09 github-actions[bot]