
'No, no this should not happen' Happens

Open madkoppa opened this issue 6 years ago • 15 comments

The title is pretty self-explanatory. I used your implementation successfully a few weeks ago and everything was perfect, but now that I have installed it on another machine, it starts spamming that particular error message after a few iterations. I have tried installing everything from scratch and nothing seems to work. I am using the same data as on the other machines.

Has something been changed? It's pretty silly, but this is the only TSNE implementation I can find that won't take me a day per attempt.

madkoppa • Nov 24 '17 13:11

I checked out an earlier version, the commit with the merge of pull request #31, and it works fine. So somewhere between then and now this error was introduced. It usually happens after 200-1000 iterations, if that helps at all.

EDIT: Also, the number of cores does not affect this bug; it still happens even with 1.

madkoppa • Nov 24 '17 13:11

I will try to handle this next week.

DmitryUlyanov • Nov 27 '17 13:11

It happens to me sometimes as well, when the perplexity is high enough.

asanakoy • Dec 18 '17 00:12

I also see this error message (thousands of times), but the TSNE computation actually seems to finish normally nevertheless.

@asanakoy can you please elaborate? Are you saying that you can successfully avoid the error by setting the perplexity parameter appropriately?

dietmar • Dec 18 '17 09:12

Hi, fixed a bug here https://github.com/DmitryUlyanov/Multicore-TSNE/commit/f5c5be16d0b4950ce3be3e2c393c96e0aa0015c5#diff-f8b3cce0b3183b4e02d913d9eb933c6eR210

Can you please do git pull and check if it solved the issue?

DmitryUlyanov • Dec 23 '17 13:12

@DmitryUlyanov, thank you. I will write you back when I try the update.

asanakoy • Dec 23 '17 20:12

@DmitryUlyanov I have reinstalled after doing a git pull, but I still get that message. If it helps, my matrix is very sparse [sparsity = 0.012031, Dimensions are: 2292 rows, 514 columns] and the perplexity I am using is 5.

zeneofa • Dec 27 '17 15:12

@DmitryUlyanov, I have encountered this warning message as well. And as @dietmar said, TSNE finished normally. Could you let me know if this message actually affects the TSNE results?

Thanks!

bli25wisc • Jan 09 '18 20:01

Hello @DmitryUlyanov, I'm trying to use the TSNE and I get the same error! I have a 136500x1120 (sparsity = 0.001129) dataset and I am trying to run it with these settings: 48 cores, no_dims = 2, perplexity = 30.000000, and theta = 0.500000.

TSNE builds the tree with no problem, and then at a random iteration the 'No, no this should not happen' message appears and never goes away.

I tested with a randomly generated matrix of the same size and it runs to the end without errors, while running a subset of my data (1000x1120) causes the same problem.

Are there any updates related to this issue?

So @fistR @DmitryUlyanov, is this a warning or an error? It manages to run to the end and complete all iterations, but I don't know if I should trust the output.

Thanks!

AlexandreLaborde • Jan 18 '18 15:01

Adding a very small amount of noise to the data seems to solve the issue. Maybe it's just related to the amount of zeros in the dataset.

AlexandreLaborde • Jan 18 '18 19:01
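
To check whether zeros or zero-padding play a role, it may help to measure how many entries are exactly zero and how many rows are exact copies of another row. The sketch below is only a diagnostic, assuming the input is a dense 2-D NumPy array; the helper names `zero_fraction` and `count_duplicate_rows` are hypothetical, and nothing in this thread confirms that duplicate points are the cause.

```python
import numpy as np


def zero_fraction(X):
    """Return the fraction of entries in X that are exactly zero."""
    X = np.asarray(X)
    return float(np.count_nonzero(X == 0)) / X.size


def count_duplicate_rows(X):
    """Return how many rows are exact copies of another row."""
    X = np.asarray(X)
    unique_rows = np.unique(X, axis=0)
    return X.shape[0] - unique_rows.shape[0]


if __name__ == "__main__":
    # Toy data (hypothetical): rows 0 and 1 are identical.
    X = np.array([[0.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 2.0, 3.0]])
    print("zero fraction:", zero_fraction(X))
    print("duplicate rows:", count_duplicate_rows(X))
```

If the duplicate count is large, comparing a run on the de-duplicated rows against a run on the full data would show whether the message depends on them.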

@DmitryUlyanov, f5c5be1#diff-f8b3cce0b3183b4e02d913d9eb933c6eR210 didn't fix it.

I believe the problem is numerical instability when the point is checked to lie within the bounds of the child: splittree.cpp#L149 -> splittree.cpp#L110.

asanakoy • Jan 23 '18 17:01

Hi @DmitryUlyanov, I just encountered this issue as well. Adding noise as suggested by @AlexandreLaborde, i.e. mat_noisy = mat + np.random.normal(loc=0,scale=0.001), didn't help in my case.

jayant91089 • Jan 26 '18 19:01

Hi @jayant91089, I just want to let you know that I added the noise in exactly the same way and with the same scale. My dataset has a lot of padding made by adding zeros; do you have a lot of zeros as well?

AlexandreLaborde • Jan 28 '18 18:01
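
One detail worth checking in the noise expression quoted above: called without a `size` argument, `np.random.normal` returns a single scalar, so every entry of the matrix is shifted by the same constant and identical rows remain identical. A minimal sketch of per-entry jitter follows, assuming `mat` is a dense NumPy array; the helper name `add_jitter` is hypothetical, and whether per-entry noise avoids the message is not confirmed in this thread.

```python
import numpy as np


def add_jitter(mat, scale=0.001, seed=None):
    """Return a copy of mat with independent Gaussian noise added to every entry."""
    mat = np.asarray(mat, dtype=float)
    rng = np.random.default_rng(seed)
    # size=mat.shape draws one noise value per entry; omitting size would
    # draw a single scalar and shift the whole matrix by the same amount.
    noise = rng.normal(loc=0.0, scale=scale, size=mat.shape)
    return mat + noise


if __name__ == "__main__":
    mat = np.zeros((4, 3))  # toy input with identical rows
    scalar_shift = mat + np.random.normal(loc=0, scale=0.001)
    jittered = add_jitter(mat, scale=0.001, seed=0)
    # Count distinct rows after each kind of perturbation.
    print("distinct rows, scalar shift:", np.unique(scalar_shift, axis=0).shape[0])
    print("distinct rows, per-entry jitter:", np.unique(jittered, axis=0).shape[0])
```

The scale should stay small relative to the spread of the real data so that the embedding is not dominated by the added noise.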

Can confirm, happens to me too. If this is actually irrelevant, an option to suppress warnings instead of printing them would be nice.

rmitsch • Jan 30 '18 15:01
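
Until such an option exists, one possible workaround is to silence the output at the file-descriptor level. The sketch below assumes the message is written by the compiled library to standard output (descriptor 1) or standard error (descriptor 2), which this thread does not confirm; the helper name `silence_fd` is hypothetical. It hides the message only and does not address the underlying cause.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def silence_fd(fd=1):
    """Temporarily redirect an OS-level file descriptor to os.devnull."""
    # Flush Python-level buffers so earlier output is not lost or reordered.
    sys.stdout.flush()
    sys.stderr.flush()
    saved = os.dup(fd)                          # keep a copy of the original target
    devnull = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull, fd)                    # fd now points at the null device
        yield
    finally:
        os.dup2(saved, fd)                      # restore the original target
        os.close(devnull)
        os.close(saved)


if __name__ == "__main__":
    with silence_fd(1):
        os.write(1, b"this line is discarded\n")
    print("stdout is restored")
```

In use, the `fit_transform` call would be wrapped in `with silence_fd(1):` (or `silence_fd(2)`). Text that the C library has buffered but not yet flushed when the block exits may still appear afterwards.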

I get the same warning with a clone of the latest master when using 'MulticoreTSNE' with default settings:

    import multiprocessing
    from MulticoreTSNE import MulticoreTSNE
    tsne = MulticoreTSNE(n_jobs=multiprocessing.cpu_count(), random_state=0)
    Xpr = tsne.fit_transform(X)

mrgloom • Jan 31 '18 12:01