Multicore-TSNE

update sklearn benchmark with 0.19.1

Open amueller opened this issue 8 years ago • 5 comments

Hi. Could you please try running your benchmark with scikit-learn 0.19.1? Thanks! Andy

amueller avatar Oct 27 '17 17:10 amueller

I think your results are because we used bad data structures and you ran out of RAM, but it's hard to say.

amueller avatar Oct 27 '17 17:10 amueller

Hi, I've now tested with sklearn 0.19.1, using the same parameters as for this repo:

tsne = TSNE(n_components=2, random_state=0, n_iter=1000, min_grad_norm=0, verbose=1000)

I still got a very high running time:

[t-SNE] Iteration 1000: error = 3.1645739, gradient norm = 0.0000705 (50 iterations in 436.497s)
[t-SNE] Error after 1000 iterations: 3.164574
function took 15603291.888 ms

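So that's about 8730 s for gradient descent (20 blocks of 50 iterations at 436.5 s each, assuming every block took as long as the last one), and I assume the remaining 6873 s (15603 - 8730) was spent building the QuadTree.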

Does that sound correct?
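The estimate above can be reproduced with a minimal sketch like the one below. It parses the last progress line from the log and splits the total wall time into a gradient-descent part and a remainder. The helper name `split_runtime` is hypothetical, and the calculation assumes that every 50-iteration block took as long as the final one, which the log excerpt alone does not confirm.

```python
import re

# Last progress line and total wall time, copied from the log above.
LOG_LINE = ("[t-SNE] Iteration 1000: error = 3.1645739, "
            "gradient norm = 0.0000705 (50 iterations in 436.497s)")
TOTAL_MS = 15603291.888
N_ITER = 1000


def split_runtime(log_line, total_ms, n_iter):
    """Estimate (gradient_descent_s, remainder_s) from one progress line.

    Assumption: every block of iterations took as long as the block
    reported in `log_line`.
    """
    match = re.search(r"\((\d+) iterations in ([\d.]+)s\)", log_line)
    if match is None:
        raise ValueError("no '(N iterations in Xs)' fragment found")
    block_size = int(match.group(1))
    block_seconds = float(match.group(2))
    # Scale the per-block time up to the full number of iterations.
    gradient_s = block_seconds * (n_iter / block_size)
    # Whatever is left of the total wall time is not covered by the log line.
    remainder_s = total_ms / 1000.0 - gradient_s
    return gradient_s, remainder_s


gradient_s, remainder_s = split_runtime(LOG_LINE, TOTAL_MS, N_ITER)
print(f"gradient descent: ~{gradient_s:.0f} s, everything else: ~{remainder_s:.0f} s")
```

The remainder covers everything outside the logged iterations, so it may include steps such as the nearest-neighbor search as well as tree construction. Attributing all of it to the QuadTree is itself an assumption.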

DmitryUlyanov avatar Oct 29 '17 19:10 DmitryUlyanov

Hm, that seems odd. Is the number of iterations comparable to the other implementations?

amueller avatar Oct 29 '17 20:10 amueller

Yes, it is the same. Can you please run https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/MulticoreTSNE/examples/test_sklearn.py on your machine?

DmitryUlyanov avatar Oct 30 '17 07:10 DmitryUlyanov

And is that directly comparable with the settings in https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/MulticoreTSNE/examples/test_py_bh_tsne.py? If so, we need to open an issue at sklearn. From what I know, those two implementations should have the same speed. They even suggest using the sklearn implementation in their repo.

amueller avatar Oct 30 '17 17:10 amueller