string_grouper

Ngram re-use

Open • hyshandler opened this issue 1 year ago • 1 comment

I'm building an app that runs match_strings on user-entered strings against a static set of strings. The static set is stored in a feather file, loaded via pandas each time the app is used, and then split into n-grams of size 3. It's a fairly large dataset, so this takes some time. Since I'm using the same set of strings with the same n-gram size every time, is there a way to save all of the n-grams to a file and feed them directly into a fuzzy match against the user-entered strings?

Thanks in advance!

hyshandler • Jun 07 '23 18:06

Wanted to follow up on this. Thanks!

hyshandler • Nov 03 '23 13:11
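
For illustration, the idea in the question (compute the n-grams for the static strings once, save them, and reuse them for each new query) could be sketched roughly as below. This is a standard-library sketch only and does not use string_grouper's API: the helper names (`build_index`, `save_index`, `load_index`, `match`) are hypothetical, the storage format is pickle rather than feather, and similarity is cosine over plain 3-gram counts rather than the TF-IDF weighting string_grouper applies.

```python
import math
import os
import pickle
import tempfile
from collections import Counter


def char_ngrams(text, n=3):
    """Return the character n-grams of a lower-cased string."""
    text = text.lower()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def normalized_vector(text, n=3):
    """Map a string to an L2-normalised n-gram count vector (a dict)."""
    counts = Counter(char_ngrams(text, n))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {g: c / norm for g, c in counts.items()} if norm else {}


def build_index(strings, n=3):
    """Precompute n-gram vectors for the static strings (the slow step)."""
    strings = list(strings)
    return {
        "n": n,
        "strings": strings,
        "vectors": [normalized_vector(s, n) for s in strings],
    }


def save_index(index, path):
    """Persist the precomputed index so later runs can skip build_index."""
    with open(path, "wb") as fh:
        pickle.dump(index, fh)


def load_index(path):
    """Load a previously saved index (only unpickle files you created)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def match(query, index, min_similarity=0.5):
    """Return (static_string, cosine_similarity) pairs above the threshold."""
    q = normalized_vector(query, index["n"])
    results = []
    for s, v in zip(index["strings"], index["vectors"]):
        # Both vectors are unit length, so the dot product is the cosine.
        sim = sum(w * v.get(g, 0.0) for g, w in q.items())
        if sim >= min_similarity:
            results.append((s, sim))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    # One-off step: build and save the index for the static strings.
    static = ["Acme Corporation", "Globex Inc", "Initech LLC"]
    path = os.path.join(tempfile.mkdtemp(), "ngram_index.pkl")
    save_index(build_index(static), path)

    # Per-request step: load the saved index and match user input.
    index = load_index(path)
    for name, score in match("Acme Corp", index):
        print(f"{name}: {score:.3f}")
```

With this split, only the query string is turned into n-grams at request time; the cost of processing the static set is paid once when the index is built. Whether string_grouper itself can accept a precomputed matrix is not something this sketch answers.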