fuzzymatcher Division By Zero in def is_mispelling

Hey,

I've been using your lib on 0.0.1 and just updated recently (I had to hack some of the SQLite fts keywords and will fix that up again) but I've come across a problem:

You get a div zero error in tokencomparison.py -> def is_mispelling(self, token1, token2)

Here are the values of the vars in that function when it throws:

float division by zero
token1: 0
token2: 2
mis_t1: []
mis_t2: []
common: []

I know you're comparing distance for string tokens, but what is the logic behind numeric values? How do you determine whether two numbers are misspellings of each other (even ignoring the 0 value)?

Even if you swap max() / min() for min() / max() and take the inverse, you'll still hit the same problem when a value is 0.

Maybe an absolute difference is better, but that breaks down when a digit is added by mistake (e.g. 1 typed as 10).

Maybe edit distance is still best used here?
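
To make the three options concrete, here is a minimal sketch that compares them on a few numeric tokens. It is not the library's code: `ratio`, `edit_distance`, and the sample pairs are hypothetical, written only to illustrate the trade-offs described above.

```python
def ratio(token1, token2):
    """max/min ratio of two numeric tokens, or None when the smaller value is 0."""
    low, high = sorted((float(token1), float(token2)))
    if low == 0:
        return None  # the ratio is undefined here; this is the division-by-zero case
    return high / low


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Sample pairs chosen for illustration only.
for t1, t2 in [("0", "2"), ("1", "10"), ("100", "101")]:
    print(t1, t2,
          "ratio:", ratio(t1, t2),
          "abs diff:", abs(float(t1) - float(t2)),
          "edit distance:", edit_distance(t1, t2))
```

All three pairs are one edit apart as strings, while the ratio and the absolute difference rank them very differently, so which measure is "best" depends on whether you care about typos or about numeric closeness.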

As an aside, thanks for making this library; it's saved me some time so far :)

Jan 13 '18 13:01 gffde3

So I just set the exception for div 0 to return False. Seems to work alright.

Jan 18 '18 11:01 gffde3

I had this same issue but can't seem to replicate your fix. Do you mind posting the snippet of the is_mispelling function that you changed?

And thank you to you both, for making this package and working on this issue, as it would be a huge help.

Jan 24 '18 22:01 lalalandau

This bit seemed to work for me, though not sure if it is the most efficient:

        # Guard against dividing by zero when either number is 0
        if t1f == 0 or t2f == 0:
            return False

        return max(t1f, t2f) / min(t1f, t2f) < self.number_fuzz_threshold

Mar 12 '18 21:03 jacobod

I'm also getting the ZeroDivisionError and can't figure out how to get around it while still returning the correctly linked dataframe. I saw the earlier comment about changing the exception for div 0 to return False, and I would also like to see a snippet showing what to change and how. I've tried to implement the snippet above, but the same issue persisted.

May 02 '18 00:05 junaidahmed361

As pointed out by @gffde3, I added:

except ZeroDivisionError:
    pass

on line 40 of tokencomparison.py and it did the trick. 🎉

Sep 07 '18 14:09 ghost

As pointed out by @gffde3, I added:

except ZeroDivisionError:
    pass

on line 40 of tokencomparison.py and it did the trick. 🎉

This works for me too, many thanks @gregobf

Oct 03 '18 07:10 kennethzhu88

I think changing line 42 to this is a little cleaner than adding a whole new exception line:

except (ValueError, ZeroDivisionError):
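
For anyone who wants to see the combined handler in context, here is a minimal standalone sketch. It is not the actual body of is_mispelling: the function name `numbers_look_similar` and the default threshold of 1.01 are assumptions made for illustration, and only the max/min ratio test and the except clause come from this thread.

```python
def numbers_look_similar(token1, token2, threshold=1.01):
    """Hypothetical stand-in for the numeric branch of is_mispelling."""
    try:
        t1f = float(token1)  # raises ValueError for non-numeric tokens
        t2f = float(token2)
        # raises ZeroDivisionError when the smaller value is 0
        return max(t1f, t2f) / min(t1f, t2f) < threshold
    except (ValueError, ZeroDivisionError):
        # Either failure means the tokens cannot be compared as numbers.
        return False


print(numbers_look_similar("0", "2"))        # zero no longer raises
print(numbers_look_similar("100", "100.5"))
print(numbers_look_similar("abc", "1"))
```

Note that in this sketch two zeros also fall into the except branch and compare as not similar, so exact matches would need to be handled before this check.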

Dec 01 '18 20:12 chris1610

Closed by #43

Feb 22 '19 09:02 RobinL

Thanks @chris1610, and thanks to everyone who reported this.

Feb 22 '19 09:02 RobinL

I am still getting this error despite the update to tokencomparison.py (a ZeroDivisionError on line 40, as noted above). Note that I pip installed the package, so perhaps that is the issue. Any help is much appreciated!
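
One way to check whether a pip-installed copy actually contains the fix is to read the source file that Python imports. A minimal sketch, assuming the module path is `fuzzymatcher.tokencomparison` (inferred from the file name mentioned in this thread); `source_mentions` is a hypothetical helper, not part of the package:

```python
import importlib.util
from pathlib import Path


def source_mentions(module_name, needle):
    """Return True/False if the module's source contains `needle`,
    or None when the module or its source file cannot be located."""
    try:
        # For dotted names this imports the parent package first.
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        return None
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    path = Path(spec.origin)
    if path.suffix != ".py" or not path.exists():
        return None
    return needle in path.read_text(encoding="utf-8")


# Prints None if fuzzymatcher is not installed in this environment.
print(source_mentions("fuzzymatcher.tokencomparison", "ZeroDivisionError"))
```

If this prints False, the installed file predates the change, which would suggest the published release lags behind the repository.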

Mar 27 '19 13:03 7cb15

Same here, I used regular pip install and pulled from GitHub.

Apr 23 '19 14:04 ghost

Same here, in colab through pip

Apr 26 '22 08:04 kanlancb
