
Not working with parallel processing frameworks like Dask

Open mittalsuraj18 opened this issue 5 years ago • 4 comments

mittalsuraj18 avatar Dec 05 '18 07:12 mittalsuraj18

Can you give an example of your implementation?

ssameerr avatar Jan 07 '19 10:01 ssameerr

import dask.bag as db
from flashtext import KeywordProcessor
processor = KeywordProcessor()
string_list = ["new york is a big city", "apple is a fruit"]
bags = db.from_sequence(string_list, npartitions=2)
extracted_words = bags.map(processor.extract_keywords)
extracted_words.compute()

Ran with Python 3.6.

mittalsuraj18 avatar Jan 07 '19 14:01 mittalsuraj18

You have to add the keywords first.

giriannamalai avatar Mar 14 '20 16:03 giriannamalai

Hi @mittalsuraj18, as mentioned by @giriannamalai, you have to add keywords to the processor before extracting, e.g. by calling

processor.add_keyword("New York")

Rémi

remiadon avatar May 07 '21 21:05 remiadon