flashtext issues

Results 70 flashtext issues

Unable to find/replace '_' with ' ' even when removing '_' from non word boundaries

I'm currently processing a list of 100k+ texts. Regex is incredibly slow for this, so I thought FlashText would be perfect. I'm unable to use FlashText to replace the '_'...

ezekielg
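
For a single-character substitution like this one, a plain string translation may be enough, with no keyword matching involved. The following is a minimal sketch using only the Python standard library. It does not use FlashText, and `replace_underscores` is a hypothetical helper name chosen for illustration.

```python
def replace_underscores(texts):
    """Return a new list with every '_' replaced by a single space."""
    # str.maketrans builds a one-entry character mapping: '_' -> ' '.
    table = str.maketrans("_", " ")
    return [text.translate(table) for text in texts]


cleaned = replace_underscores(["snake_case_name", "no underscores", "a__b"])
print(cleaned)
```

Because the mapping is applied per character, it ignores word boundaries entirely, which is the point here: an underscore inside a word is replaced just like one between words.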

Total keywords count

It would be good to have a count getter to obtain the number of keywords processed. I am using the processor to identify the presence of the keywords, but still in...

andreamoro

Is the library still actively maintained?

Hi @vi3k6i5, Thanks for the wonderful library. It really helps a lot to speed up data preprocessing iterations. I plan to use this library for my internal text library; however...

jurukode

Appending a number to the target word causes the extraction to fail.
>>> import flashtext
>>> _extractor = flashtext.KeywordProcessor()
>>> _extractor.add_keyword('地中海贫血')
True
>>> _extractor.extract_keywords('地中海贫血')
['地中海贫血']
>>> _extractor.extract_keywords('地中海贫血2')
[]

DotaArtist
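
One plausible explanation is boundary checking: if digits count as word characters, a keyword followed directly by `2` does not end on a word boundary and is rejected. The sketch below is a simplified standard-library model of that idea, not FlashText's actual implementation. It assumes the default set of word characters is ASCII letters, digits and `_`; `extract_with_boundaries` is a hypothetical helper.

```python
import string

# Assumption: the default "word" characters are ASCII letters, digits and '_'.
DEFAULT_NON_WORD_BOUNDARIES = set(string.ascii_letters + string.digits + "_")


def extract_with_boundaries(text, keyword, non_word_boundaries):
    """Return the keyword once per occurrence that is not glued to a word character."""
    found = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        # A match counts only if the characters on both sides are not word characters.
        before_ok = start == 0 or text[start - 1] not in non_word_boundaries
        after_ok = end == len(text) or text[end] not in non_word_boundaries
        if before_ok and after_ok:
            found.append(keyword)
        start = text.find(keyword, start + 1)
    return found


keyword = "地中海贫血"
without_digits = DEFAULT_NON_WORD_BOUNDARIES - set(string.digits)

strict = extract_with_boundaries("地中海贫血2", keyword, DEFAULT_NON_WORD_BOUNDARIES)
relaxed = extract_with_boundaries("地中海贫血2", keyword, without_digits)
print(strict)   # [] because '2' is treated as part of the word
print(relaxed)  # ['地中海贫血'] once digits no longer count as word characters
```

Under this model, removing digits from the word-character set is what lets the trailing `2` act as a boundary. Whether that trade-off is acceptable depends on the data, since it also lets keywords match inside tokens such as `abc123`.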

can't search overlapped words?

kp = KeywordProcessor()
kp.add_keyword("ABC DE")
kp.add_keyword("DE FGHI")
kp.extract_keywords("ABC DE FGHI")
>>> ['ABC DE']
Why not ['ABC DE', 'DE FGHI']?

xuexcy
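
The output above is what a single left-to-right pass would produce if each matched span is consumed before scanning continues, so `DE` is no longer available to start a second match. If overlapping matches are wanted, one option is to search for each keyword independently. The sketch below does that with the standard library only; it is not FlashText code, `extract_overlapping` is a hypothetical helper, and it scans the text once per keyword, so it will be much slower than a trie for large keyword lists.

```python
def extract_overlapping(text, keywords):
    """Return every keyword occurrence ordered by start position, overlaps included."""
    matches = []
    for keyword in keywords:
        start = text.find(keyword)
        while start != -1:
            end = start + len(keyword)
            # Accept only matches that begin and end on a word edge.
            left_ok = start == 0 or not text[start - 1].isalnum()
            right_ok = end == len(text) or not text[end].isalnum()
            if left_ok and right_ok:
                matches.append((start, keyword))
            start = text.find(keyword, start + 1)
    return [keyword for _, keyword in sorted(matches)]


print(extract_overlapping("ABC DE FGHI", ["ABC DE", "DE FGHI"]))
# ['ABC DE', 'DE FGHI']
```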

Support Fuzzy Matching

Hi! Thanks for this project :) It would be great if you supported the same algorithm that IBM Watson Conversation uses when the Fuzzy Matching option is activated...

Themandunord
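
As a rough illustration of what fuzzy keyword matching could look like, the sketch below compares each token against the keyword list with `difflib.get_close_matches` from the Python standard library. This is not the IBM Watson algorithm and not a FlashText feature. `fuzzy_extract` and the `cutoff` default of 0.8 are assumptions made for the example, and only single-word, lowercase keywords are handled.

```python
import difflib


def fuzzy_extract(text, keywords, cutoff=0.8):
    """Return keywords that closely match some whitespace-separated token."""
    found = []
    for token in text.lower().split():
        # n=1 keeps only the best candidate; cutoff is a similarity ratio in [0, 1].
        close = difflib.get_close_matches(token, keywords, n=1, cutoff=cutoff)
        if close and close[0] not in found:
            found.append(close[0])
    return found


print(fuzzy_extract("I love pyhton", ["python", "java"]))
```

Each token is compared against every keyword, so the cost grows with the size of the keyword list. That is a very different performance profile from trie-based exact matching.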

adding mixed case-sensitive and case-insensitive keywords for same keywordprocessor

To extract mixed case-sensitive and case-insensitive keywords from text, is it possible to construct one KeywordProcessor to handle both? Thanks.

pwyang123
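
One way to think about this is as two lookups over the same text: an exact comparison for the case-sensitive keywords and a case-folded comparison for the rest. With FlashText that would probably mean two processors, one created with `case_sensitive=True`. The sketch below shows the same idea with the standard library only. `extract_mixed_case` is a hypothetical helper that handles single-word keywords and strips only a few trailing punctuation marks.

```python
def extract_mixed_case(text, case_sensitive, case_insensitive):
    """Match tokens exactly against one set and case-folded against another."""
    folded = {keyword.lower() for keyword in case_insensitive}
    found = []
    for token in text.split():
        word = token.strip(".,;:!?")
        # Exact comparison first, then the case-insensitive fallback.
        if word in case_sensitive or word.lower() in folded:
            found.append(word)
    return found


print(extract_mixed_case("Apple sells apples; PYTHON beats python.", {"Apple"}, {"python"}))
# ['Apple', 'PYTHON', 'python']
```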

span_info on combined unicode character(s)

Hello, I encountered an issue with `span_info=True` when used on a string with combined characters. As a demonstration, consider the following example: ```python import re from flashtext import KeywordProcessor from unicodedata...

kkaiser
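
The report is truncated, so the exact failure is not visible here. One general reason offsets can disagree on such strings is that the same visible text has different lengths in composed (NFC) and decomposed (NFD) form. The sketch below shows that effect with `unicodedata` from the standard library. Treating it as the cause of this issue is an assumption, and `normalized_span` is a hypothetical helper.

```python
import unicodedata

# "café bar" with a precomposed é (U+00E9), then the same text decomposed.
composed = unicodedata.normalize("NFC", "caf\u00e9 bar")
decomposed = unicodedata.normalize("NFD", "caf\u00e9 bar")

print(len(composed), len(decomposed))                # 8 9
print(composed.find("bar"), decomposed.find("bar"))  # 5 6


def normalized_span(text, keyword, form="NFC"):
    """Find keyword after normalizing both strings to the same form."""
    text_n = unicodedata.normalize(form, text)
    keyword_n = unicodedata.normalize(form, keyword)
    start = text_n.find(keyword_n)
    # The returned offsets refer to the normalized text, not the original input.
    return None if start == -1 else (start, start + len(keyword_n))


print(normalized_span(decomposed, "bar"))  # (5, 8)
```

Normalizing both the text and the keywords to one form before matching keeps the offsets consistent, provided the same normalized text is used when slicing with the returned spans.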

can it deal with large keywords list?

keywordsList = ["java", "python"]
keyword_processor.add_keywords_from_list(keywordsList)
If keywordsList contains millions of entries, keyword_processor.extract_keywords() extracts nothing. How can it deal with a keyword list of that size?

kongbb1

rm non_word_boundaries

Extract Keywords Bug
