string_grouper
Ngram re-use
I'm building an app that runs match_strings on user-entered strings against a static set of strings. The static set is stored in a feather file, loaded with pandas each time the app runs, and then split into n-grams of size 3. It's a fairly large dataset, so this takes some time. Since I'm using the same set of strings and the same n-gram size every time, I'm wondering if there's a way to save all of the n-grams to a file and simply feed those into a fuzzy match against the user-entered strings.
Thanks in advance!
Wanted to follow up on this. Thanks!
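To make the idea in the question concrete, here is a minimal standard-library sketch of the general pattern: compute the n-grams for the static strings once, save them to disk, and on later runs load them and only n-gram the user-entered string. This is not string_grouper's API or internals. The helper names (`build_cache`, `load_cache`, `match`), the cache file name, the sample strings, and the plain cosine similarity over raw n-gram counts are all assumptions made for illustration, so scores will not match what match_strings returns.

```python
import math
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def ngrams(text, n=3):
    """Return counts of the character n-grams in text (lowercased)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def build_cache(strings, path, n=3):
    """Compute n-gram counts for every static string once and pickle them."""
    strings = list(strings)
    cache = {"n": n, "strings": strings, "grams": [ngrams(s, n) for s in strings]}
    with open(path, "wb") as fh:
        pickle.dump(cache, fh)


def load_cache(path):
    """Load the precomputed n-grams (only unpickle files you created yourself)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def cosine(a, b):
    """Cosine similarity between two n-gram Counters."""
    dot = sum(count * b[gram] for gram, count in a.items())  # missing grams count as 0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match(query, cache, min_similarity=0.5):
    """Score one user-entered string against the cached static set, best first."""
    query_grams = ngrams(query, cache["n"])  # only the query is n-grammed at match time
    scored = [(s, cosine(query_grams, g)) for s, g in zip(cache["strings"], cache["grams"])]
    hits = [pair for pair in scored if pair[1] >= min_similarity]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# One-time step (hypothetical data and file name).
cache_path = Path(tempfile.gettempdir()) / "static_ngrams.pkl"
build_cache(["Acme Corporation", "Globex Inc", "Initech LLC"], cache_path)

# Every later run: load the cache and match the user input against it.
cache = load_cache(cache_path)
results = match("Acme Corp", cache)
```

The one-time step would run whenever the static set changes; the app itself would only call `load_cache` and `match`. For a large dataset, a sparse matrix representation would likely be far faster than Python dictionaries, but the save-once, load-later structure stays the same.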