
Extract Keywords Bug

Open • DotaArtist opened this issue 4 years ago • 1 comment

Appending a digit directly to the target keyword causes the extraction to fail.

    >>> import flashtext
    >>> _extractor = flashtext.KeywordProcessor()
    >>> _extractor.add_keyword('地中海贫血')
    True
    >>> _extractor.extract_keywords('地中海贫血')
    ['地中海贫血']
    >>> _extractor.extract_keywords('地中海贫血2')
    []

DotaArtist • Jul 19 '19 03:07

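FlashText is designed to match only complete words, that is, keywords with boundary characters on both sides. In '地中海贫血2' the keyword is followed directly by the digit 2, which counts as part of the word rather than as a boundary, so nothing is extracted.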

https://arxiv.org/pdf/1711.00046.pdf

rmz59 • Sep 12 '19 20:09
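
To make the whole-word rule from the reply above concrete, here is a minimal sketch in plain Python. It is not FlashText's implementation (FlashText walks a trie); it only illustrates the boundary check. The helper `extract_whole_word` and the constant `WORD_CHARS` are hypothetical names, and the sketch assumes the word characters are ASCII letters, digits, and underscore, which is what FlashText appears to use by default.

```python
import string

# Assumption: characters that count as "inside a word"; anything else is a boundary.
WORD_CHARS = set(string.ascii_letters + string.digits + "_")


def extract_whole_word(keyword, text, word_chars=WORD_CHARS):
    """Return the keyword once for each occurrence bounded on both sides."""
    matches = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        # A boundary is either the edge of the text or a non-word character.
        left_ok = start == 0 or text[start - 1] not in word_chars
        right_ok = end == len(text) or text[end] not in word_chars
        if left_ok and right_ok:
            matches.append(keyword)
        start = text.find(keyword, start + 1)
    return matches


keyword = "地中海贫血"
plain = extract_whole_word(keyword, "地中海贫血")        # boundary on both sides
with_digit = extract_whole_word(keyword, "地中海贫血2")  # digit is a word character
with_space = extract_whole_word(keyword, "地中海贫血 2") # space is a boundary
# Treating digits as boundaries (hypothetical workaround) lets the keyword match.
digits_as_boundary = extract_whole_word(
    keyword, "地中海贫血2", set(string.ascii_letters + "_")
)
```

In this sketch, `with_digit` is empty for the same reason as in the report, while `with_space` and `digits_as_boundary` both contain the keyword. In FlashText itself the corresponding set is the `non_word_boundaries` attribute of `KeywordProcessor`; narrowing it to exclude digits may work as a workaround, but that is an assumption not verified in this thread.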