Max Bachmann comments

Performance Optimization - Fail Fast

You're right: in FuzzyWuzzy, score_cutoff is just a filter, so it calculates the results first and filters them afterwards. [RapidFuzz](https://github.com/maxbachmann/rapidfuzz) uses exactly the behaviour you describe to improve the performance...
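To illustrate why a real cutoff can be cheaper than a filter, here is a minimal sketch of one possible fail-fast check. It is not RapidFuzz's actual code: the helper names `indel_distance` and `similarity_with_cutoff`, and the normalization over the combined string length, are assumptions made for this example only.

```python
def indel_distance(a: str, b: str) -> int:
    """Insertion/deletion distance, derived from the LCS length (hypothetical helper)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    lcs_length = prev[-1]
    return len(a) + len(b) - 2 * lcs_length


def similarity_with_cutoff(a: str, b: str, score_cutoff: float = 0.0) -> float:
    """Return a 0-100 similarity, or 0.0 if it cannot reach score_cutoff."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0
    # The distance is at least the length difference, so the lengths alone
    # give an upper bound on the score (assumed normalization, see lead-in).
    upper_bound = 100.0 * (1 - abs(len(a) - len(b)) / total)
    if upper_bound < score_cutoff:
        return 0.0  # fail fast: skip the quadratic-time comparison entirely
    score = 100.0 * (1 - indel_distance(a, b) / total)
    return score if score >= score_cutoff else 0.0
```

With `score_cutoff=90`, comparing `"a"` against `"abcdefghij"` returns `0.0` from the length check alone, whereas a filter would first run the full comparison and then discard the result.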

Clarification between the Levenshtein Distance algorithm and how it is implemented here

The Levenshtein distance implementation uses the following weights:

- insertion: 1
- deletion: 1
- replacement: 2

and normalizes the result the following way: `100 * (1 - distance /...
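The effect of these weights can be seen in a small dynamic-programming sketch. The function name `weighted_levenshtein` and its keyword arguments are made up for this illustration and are not the library's API. Only the distance is computed, because the normalization formula is truncated in the excerpt above.

```python
def weighted_levenshtein(s1: str, s2: str,
                         insert_cost: int = 1,
                         delete_cost: int = 1,
                         replace_cost: int = 2) -> int:
    """Edit distance with configurable costs (illustrative helper, not a library call)."""
    # prev[j] holds the cost of turning the processed prefix of s1 into s2[:j].
    prev = [j * insert_cost for j in range(len(s2) + 1)]
    for i, c1 in enumerate(s1, start=1):
        cur = [i * delete_cost]
        for j, c2 in enumerate(s2, start=1):
            substitution = prev[j - 1] + (0 if c1 == c2 else replace_cost)
            cur.append(min(prev[j] + delete_cost,     # delete c1
                           cur[j - 1] + insert_cost,  # insert c2
                           substitution))             # keep or replace
        prev = cur
    return prev[-1]
```

With a replacement cost of 2, a replacement is never cheaper than a deletion followed by an insertion. For `"kitten"` and `"sitting"` the sketch returns 5, compared with 3 when every operation costs 1.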

Implemented sort order matches by common letter count largest to smallest

Overall, I am personally not convinced this should be added at all, for two reasons: 1) it adds more arguments, which makes it increasingly hard to use the function...

user process.extract for chinese returns wrong result

The default scorer selected by process.extract is `fuzz.WRatio`, which by default converts all non-ASCII characters to whitespace and trims the result. So in your case you're comparing empty...
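The effect described above can be reproduced with a simplified stand-in for that preprocessing step. The function `ascii_only_preprocess` is hypothetical and only mimics the behaviour as stated in the comment; it is not the library's own code.

```python
import re


def ascii_only_preprocess(text: str) -> str:
    """Replace every non-ASCII character with a space, then trim (simplified stand-in)."""
    return re.sub(r"[^\x00-\x7F]", " ", text).strip()


# Two different Chinese strings collapse to the same empty string,
# so any scorer applied afterwards can no longer tell them apart.
first = ascii_only_preprocess("你好")
second = ascii_only_preprocess("世界")
```

Under this assumption, a scorer that runs after the preprocessing only ever sees empty input for purely non-ASCII strings, which would explain results that look arbitrary.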

Partial_Ratio not working

partial_ratio searches for the best alignment between two strings and then calculates the `fuzz.ratio` for this alignment. So while in @aW3st's case the word 'etf' is part of the second...
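The idea of scoring the best alignment can be sketched with a brute-force sliding window. This is not how the library implements it: `difflib.SequenceMatcher` stands in for `fuzz.ratio`, and the names `ratio` and `partial_ratio_sketch`, as well as the sample strings, are assumptions for illustration.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Stand-in for fuzz.ratio on a 0-100 scale (assumption: difflib-based)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def partial_ratio_sketch(s1: str, s2: str) -> float:
    """Score the best same-length window of the longer string against the shorter one."""
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    if not shorter:
        return 100.0 if not longer else 0.0
    window = len(shorter)
    # Try every alignment of the shorter string inside the longer one
    # and keep the best score.
    return max(
        ratio(shorter, longer[start:start + window])
        for start in range(len(longer) - window + 1)
    )
```

Because the sketch picks the shorter and longer string by length, `partial_ratio_sketch("etf", "an etf fund")` and `partial_ratio_sketch("an etf fund", "etf")` give the same score.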

Fix some code quality issues

Following https://github.com/seatgeek/fuzzywuzzy/pull/254 in February, I doubt that the repository owners want to adopt deepsource ;)

process is slow

You should provide examples for people to reproduce your test. Some of the interesting points:

- what's the ratio you used with process
- did you use the faster version...

process is slow

I performed a quick test using the following test code:

```python
setup="""
from {} import process, fuzz
with open("Lorem-ipsum-dolor-sit-amet.txt") as fw:
    text = fw.read()
words = text.split()
query = words[0]...
```

Optimized fuzz.py

@josegonzalez It's a line count optimisation. However, in my opinion, the string joining at least hurts readability.

How to get fuzzy index?

fuzz.partial_ratio searches for the best alignment of the shorter string to the longer string. It does not matter in which order you pass them in, as long as they do not...