There are lots of NaNs in the distance matrix for the example dataset

Open · ilyaraz opened this issue 8 years ago · 1 comment

When I run the example script inside VMware with Ubuntu installed as the guest OS, I get a distance matrix with around 100K NaN entries. Could it be a problem with the EMD solver?

ilyaraz, Feb 20 '17 20:02

OK, I debugged a bit and figured out that it happens whenever a tweet consists entirely of stop words. In that case the bag of words is empty after stop-word removal, and the EMD solver produces NaN for any pair involving that tweet.

ilyaraz, Feb 20 '17 20:02
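
One possible workaround for the diagnosis above is to detect tweets whose bag of words is empty before they reach the EMD solver. The following is a minimal sketch, not the repository's actual code: the stop-word list, the whitespace tokenizer, and the helper names `normalized_bow` and `split_usable` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical stop-word list for illustration; the example script uses its own.
STOP_WORDS = {"the", "a", "is", "it", "and", "to", "of"}


def normalized_bow(text, stop_words=STOP_WORDS):
    """Return {word: weight} with weights summing to 1.

    Returns None when every token is a stop word, because an empty
    histogram cannot be normalized (the total mass would be zero).
    """
    tokens = [t for t in text.lower().split() if t not in stop_words]
    if not tokens:
        return None
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def split_usable(texts):
    """Partition document indices by whether their bag of words is non-empty."""
    kept, dropped = [], []
    for i, text in enumerate(texts):
        if normalized_bow(text) is not None:
            kept.append(i)
        else:
            dropped.append(i)
    return kept, dropped


tweets = ["the flight is delayed", "it is the", "great service"]
kept, dropped = split_usable(tweets)
print(kept, dropped)  # [0, 2] [1]
```

Only the documents in `kept` would then be passed to the solver, so the distance matrix is computed over non-empty histograms. Whether dropped tweets should be skipped or assigned some default distance depends on the downstream task.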