libTLDA

divide by zero

Open. TissueC opened this issue 5 years ago • 1 comment

When I use ImportanceWeightedClassifier (IW) with iwe=kde, the console shows the following warning:

C:\Users\Administrator\Anaconda3\envs\libtlda\lib\site-packages\scipy\stats\kde.py:262: RuntimeWarning: invalid value encountered in true_divide
  result = result / self._norm_factor

FLDA also has this problem (divide by zero). Is there something wrong with my dataset?

Any comments will be appreciated.

TissueC, May 20 '19 14:05

Hi TissueC, thanks for your interest in the library.

I think this is a numerical stability issue. SciPy's KDE will only produce a divide-by-zero if all estimated weights are 0, and that only occurs if every target point is so far from the source points that exp(-distance(..)) is rounded off to 0. So, I suspect that your data sets are too far apart in feature space.
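
To illustrate the underflow described above, here is a minimal pure-Python sketch. It is not SciPy's or libTLDA's implementation: `gaussian_kernel_density` and `normalize` are hypothetical helpers, the data is 1-D toy data, and the bandwidth of 1.0 is an assumption. It only shows that a Gaussian kernel evaluated far from every sample rounds to exactly 0, so that normalizing the resulting weights would divide 0 by 0.

```python
import math


def gaussian_kernel_density(x, samples, bandwidth=1.0):
    """Unnormalized 1-D Gaussian kernel density of x under samples (hypothetical helper)."""
    kernel_values = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples]
    return sum(kernel_values) / len(samples)


def normalize(weights):
    """Scale weights to sum to 1, failing loudly when every weight is exactly 0."""
    total = sum(weights)
    if total == 0.0:
        # With NumPy arrays this division would yield NaN plus a RuntimeWarning.
        raise ValueError("all weights are 0; the data sets may not overlap")
    return [w / total for w in weights]


# Toy data (assumption): source samples near 0, one target set nearby, one far away.
source = [0.0, 0.5, 1.0]
near_target = [0.2, 0.8]
far_target = [100.0, 101.0]

near_density = [gaussian_kernel_density(t, source) for t in near_target]
far_density = [gaussian_kernel_density(t, source) for t in far_target]

print(near_density)  # strictly positive values
print(far_density)   # every exp(...) term underflows to 0.0
```

Calling `normalize(near_density)` returns weights that sum to 1, while `normalize(far_density)` raises the `ValueError`, which is the pure-Python analogue of the 0/0 situation behind the warning.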

ImportanceWeightedClassifier cannot deal with data sets that don't overlap to at least some extent. I did not explicitly mention this anywhere, but I'll try to find a place to do that.
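
As a rough way to check for the lack of overlap mentioned above before fitting, one could compare the per-feature value ranges of the source and target sets. The sketch below is only a crude heuristic under stated assumptions: `feature_ranges_overlap` is a hypothetical helper, not part of libTLDA, and the data is made up. Intersecting ranges do not guarantee that the densities overlap, but a feature whose ranges do not intersect at all is a strong hint that the sets are too far apart.

```python
def feature_ranges_overlap(X, Z):
    """Return one boolean per feature: True if the [min, max] ranges of X and Z intersect.

    X and Z are lists of rows (samples), each row a list of feature values.
    Hypothetical helper for illustration; not part of libTLDA.
    """
    n_features = len(X[0])
    overlaps = []
    for j in range(n_features):
        x_col = [row[j] for row in X]
        z_col = [row[j] for row in Z]
        # Two intervals intersect when the larger lower bound does not exceed
        # the smaller upper bound.
        lower = max(min(x_col), min(z_col))
        upper = min(max(x_col), max(z_col))
        overlaps.append(lower <= upper)
    return overlaps


# Toy data (assumption): X is the source set, Z_near and Z_far are target sets.
X = [[0.0, 1.0], [1.0, 2.0], [0.5, 1.5]]
Z_near = [[0.8, 1.2], [1.5, 2.5]]
Z_far = [[50.0, 1.2], [60.0, 2.5]]

print(feature_ranges_overlap(X, Z_near))
print(feature_ranges_overlap(X, Z_far))
```

In this toy example the first feature of `Z_far` lies entirely outside the range of `X`, so it is flagged as non-overlapping; standardizing the features or inspecting them individually would be a reasonable next step in that case.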

You say you get the same problem with FLDA? Can you tell me a bit more about how you're using it?

wmkouw, Jun 05 '19 22:06