Max Bachmann
You're right: in FuzzyWuzzy, `score_cutoff` is just a filter, so it calculates all the results and filters them afterwards. [RapidFuzz](https://github.com/maxbachmann/rapidfuzz) uses exactly the behaviour you describe to improve performance...
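To make the difference concrete, here is a minimal sketch of the two strategies. It uses `difflib.SequenceMatcher` from the standard library as a stand-in scorer, so the scores are not those of FuzzyWuzzy or RapidFuzz, and the function names are made up for illustration. The early-exit variant relies on `real_quick_ratio()`, which is a cheap upper bound on `ratio()`.

```python
from difflib import SequenceMatcher


def score(a, b):
    """Similarity in the range 0-100 (difflib stand-in, not FuzzyWuzzy's scorer)."""
    return 100 * SequenceMatcher(None, a, b).ratio()


def extract_filter_afterwards(query, choices, score_cutoff=0):
    # Score every choice in full, then drop the ones below the cutoff.
    scored = [(choice, score(query, choice)) for choice in choices]
    return [(c, s) for c, s in scored if s >= score_cutoff]


def extract_with_early_exit(query, choices, score_cutoff=0):
    results = []
    for choice in choices:
        matcher = SequenceMatcher(None, query, choice)
        # real_quick_ratio() is a cheap upper bound on ratio(): if even the
        # bound misses the cutoff, the expensive comparison can be skipped.
        if 100 * matcher.real_quick_ratio() < score_cutoff:
            continue
        s = 100 * matcher.ratio()
        if s >= score_cutoff:
            results.append((choice, s))
    return results
```

Both functions return the same matches; the second one simply avoids the full comparison for choices that cannot reach the cutoff.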
The Levenshtein distance implementation uses the following weights:
- insertion: 1
- deletion: 1
- replacement: 2

and normalizes the result the following way: `100 * (1 - distance /...
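As a rough illustration of these weights, the following sketch computes the raw weighted distance with a standard dynamic-programming table. The function name `weighted_levenshtein` is made up here, this is not the library's actual implementation, and the sketch covers the distance only, not the normalization step.

```python
def weighted_levenshtein(s1, s2, insertion=1, deletion=1, replacement=2):
    """Edit distance turning s1 into s2 with per-operation weights."""
    # previous[j] is the cost of turning the current prefix of s1 into s2[:j].
    previous = [j * insertion for j in range(len(s2) + 1)]
    for i, ch1 in enumerate(s1, start=1):
        current = [i * deletion]
        for j, ch2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + deletion,       # drop ch1
                current[j - 1] + insertion,   # add ch2
                previous[j - 1] + (0 if ch1 == ch2 else replacement),
            ))
        previous = current
    return previous[-1]
```

With a replacement weight of 2, substituting a character costs the same as deleting it and inserting the new one, so `weighted_levenshtein("abc", "abd")` is 2 rather than 1.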
Overall, I am personally not really convinced this should be added at all, for two reasons: 1) it adds more arguments, which makes the function increasingly hard to use...
The default scorer selected by `process.extract` is `fuzz.WRatio`, which by default converts all non-ASCII characters to whitespace and trims the result. So in your case you're comparing empty...
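A minimal sketch of the preprocessing step as described above shows why such input collapses to nothing. The helper name `ascii_only_process` is hypothetical, and the real default processor may apply further steps that are not modelled here.

```python
import re


def ascii_only_process(text):
    # Replace every non-ASCII character with a space, then trim the ends.
    return re.sub(r"[^\x00-\x7F]", " ", text).strip()


# A string made only of non-ASCII characters becomes empty.
print(repr(ascii_only_process("\u65e5\u672c\u8a9e")))
```

Any two strings that consist solely of non-ASCII characters end up identical after this step, because both are reduced to the empty string.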
`partial_ratio` searches for the best alignment between two strings and then calculates the `fuzz.ratio` for this alignment. So while in @aW3st's case the word 'etf' is part of the second...
Following https://github.com/seatgeek/fuzzywuzzy/pull/254 in February, I doubt that the repository owners want to adopt deepsource ;)
You should provide examples for people to reproduce your test. Some of the interesting points:
- what's the ratio you used with process
- did you use the faster version...
I performed a quick test using the following test code:
```python
setup="""
from {} import process, fuzz
with open("Lorem-ipsum-dolor-sit-amet.txt") as fw:
    text = fw.read()
words = text.split()
query = words[0]...
```
@josegonzalez It's a line count optimisation. However, in my opinion, at least the string joining hurts readability.
`fuzz.partial_ratio` searches for the best alignment of the shorter string to the longer string. It does not matter in which order you pass them in, as long as they do not...
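The idea of aligning the shorter string against the longer one can be sketched with a simple sliding window. This is an assumption-laden simplification: it uses `difflib.SequenceMatcher` as a stand-in for `fuzz.ratio`, tries every window of the shorter string's length, and the name `partial_ratio_sketch` is made up. Real implementations may pick candidate alignments differently and return different scores.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    # Stand-in for fuzz.ratio, scaled to 0-100.
    return 100 * SequenceMatcher(None, a, b).ratio()


def partial_ratio_sketch(s1, s2):
    # Always align the shorter string against the longer one.
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    if not shorter:
        return 0.0  # nothing to align
    window = len(shorter)
    # Score the shorter string against every same-length slice of the
    # longer string and keep the best alignment.
    return max(
        ratio(shorter, longer[start:start + window])
        for start in range(len(longer) - window + 1)
    )
```

Because the strings are sorted by length first, `partial_ratio_sketch("etf", "the etf fund")` and `partial_ratio_sketch("the etf fund", "etf")` give the same score when the lengths differ.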