tsnejs

Trouble with NaNs after step function updates

Open domluna opened this issue 10 years ago • 6 comments

Some background: I'm trying to visualize my Spotify playlists but I'm having some trouble getting going here.

My data consists of 267 songs, each song has 10 features. Here's a sample (ignore the artist and title fields).

{ "artist":"Drake", "audio_summary.acousticness":0.016128527, "audio_summary.danceability":0.3236382, "audio_summary.energy":0.8417243, "audio_summary.key":7, "audio_summary.liveness":0.13018084, "audio_summary.loudness":-5.548, "audio_summary.mode":1, "audio_summary.speechiness":0.0, "audio_summary.tempo":98.39, "audio_summary.time_signature":5, "title":"Over" }

I'm passing the data as a 267-element array in which each element is a 10-element array. I'm using initDataRaw to initialize, but I've tried both init methods.
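For anyone setting up similar data, here is a minimal sketch of building the numeric rows from song objects like the sample above and failing early on any value that is not a finite number. The helper name `toFeatureRows` and the `FEATURE_KEYS` list are made up for illustration; they are not part of tsnejs, and the thread does not show how the rows were actually built.

```javascript
// Hypothetical helper, not part of tsnejs: convert song objects into
// numeric rows and throw on any missing or non-finite feature value.
const FEATURE_KEYS = [
  "audio_summary.acousticness",
  "audio_summary.danceability",
  "audio_summary.energy",
  "audio_summary.key",
  "audio_summary.liveness",
  "audio_summary.loudness",
  "audio_summary.mode",
  "audio_summary.speechiness",
  "audio_summary.tempo",
  "audio_summary.time_signature",
];

function toFeatureRows(songs, keys = FEATURE_KEYS) {
  return songs.map((song, i) =>
    keys.map((key) => {
      const value = song[key];
      // A missing key yields undefined, which becomes NaN in later arithmetic.
      if (typeof value !== "number" || !Number.isFinite(value)) {
        throw new Error(`song ${i}: "${key}" is not a finite number (${value})`);
      }
      return value;
    })
  );
}

// The sample song from the post (artist and title are ignored).
const sampleSong = {
  "artist": "Drake",
  "audio_summary.acousticness": 0.016128527,
  "audio_summary.danceability": 0.3236382,
  "audio_summary.energy": 0.8417243,
  "audio_summary.key": 7,
  "audio_summary.liveness": 0.13018084,
  "audio_summary.loudness": -5.548,
  "audio_summary.mode": 1,
  "audio_summary.speechiness": 0.0,
  "audio_summary.tempo": 98.39,
  "audio_summary.time_signature": 5,
  "title": "Over",
};

const sampleRows = toFeatureRows([sampleSong]);
console.log(sampleRows[0].length); // 10
```

If this throws for any song, the NaNs are coming from the input rather than from the optimizer, which narrows down where to look.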

The problem is even after just one call to the step function, getSolution returns [NaN, NaN].

I had this problem originally, but switching from initDataDist to initDataRaw seemed to avoid the NaNs. The visualization I got was off, though. I wish I had taken a picture, because I'm having trouble reproducing it due to the NaN issues, but essentially the songs spread out along a diagonal line, as if the data were being compressed to 1D.

I thought maybe the issue was that some fields have values much larger than others, tempo for example. So I normalized all the features, and then came the NaN problem. The weird thing is that even my old non-normalized data is giving me NaNs now!
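The post does not say how the features were normalized, so the following is only a guess at one way normalization can introduce NaNs by itself: if a column has the same value in every row, its range is zero and `(v - min) / (max - min)` becomes `0 / 0`, which is NaN in JavaScript. A minimal sketch of min-max scaling that guards against that case (the function name `minMaxNormalize` is hypothetical):

```javascript
// Hypothetical helper, not part of tsnejs: scale every column to [0, 1].
// Constant columns (max === min) are mapped to 0 instead of 0 / 0 = NaN.
function minMaxNormalize(rows) {
  const dims = rows[0].length;
  const mins = new Array(dims).fill(Infinity);
  const maxs = new Array(dims).fill(-Infinity);

  // First pass: per-column minimum and maximum.
  for (const row of rows) {
    for (let d = 0; d < dims; d++) {
      if (row[d] < mins[d]) mins[d] = row[d];
      if (row[d] > maxs[d]) maxs[d] = row[d];
    }
  }

  // Second pass: rescale, guarding the zero-range case.
  return rows.map((row) =>
    row.map((v, d) => {
      const range = maxs[d] - mins[d];
      return range === 0 ? 0 : (v - mins[d]) / range;
    })
  );
}

// The first two columns are constant; the third varies.
const normalized = minMaxNormalize([
  [0, 5, 100],
  [0, 5, 200],
  [0, 5, 150],
]);
console.log(normalized); // [[0, 0, 0], [0, 0, 1], [0, 0, 0.5]]
```

This would not explain NaNs on the non-normalized data, so it is at most one part of the picture.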

Any ideas of what I'm doing wrong? Tips for getting the data setup in general (avoiding NaNs)?

Thanks!

domluna avatar Feb 20 '15 04:02 domluna

I am having the same problem with NaNs on the attached data set:

d.txt

If I reduce the number of input rows to 194 (total is 324), then it works. Any suggestions?

OrKoN avatar Jun 14 '16 12:06 OrKoN

Hmm... I also get NaNs, but when I feed it more than 100 rows.

The number of parameters in each row doesn't seem to change anything. The number of steps doesn't seem to be the problem for me, but rather the number of input rows. I've also tried to shift the data points I feed it, with no change either: it weirdly breaks after 100, no matter what it is.

I actually tried to feed it the same row multiple times and it makes things worse: it NaNs when getting more than 3 identical points. I ran the example code with this:

var dists = [[1.0, 0.1, 0.2], [1.0, 0.1, 0.2], [1.0, 0.1, 0.2], [1.0, 0.1, 0.2]];

and it gives:

[ [ NaN, NaN ], [ NaN, NaN ], [ NaN, NaN ], [ NaN, NaN ] ]

My dataset definitely contains very similar points. Could that be the cause?

Tested in Node.js 0.12.4 and 5.8.0.
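Two properties of that input can be checked without touching the library: how many rows are exact duplicates, and whether the array is square. The second only matters if the array is being passed as a pairwise distance matrix, which is an assumption here based on the variable name `dists`; a distance matrix for N points is N×N, while the example above is 4×3. A rough sketch, with `describeInput` being a made-up helper name:

```javascript
// Hypothetical pre-flight check, not part of tsnejs.
function describeInput(rows) {
  // Count exact duplicate rows by using the joined values as a key.
  const counts = new Map();
  for (const row of rows) {
    const key = row.join(",");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let duplicateRows = 0;
  for (const count of counts.values()) duplicateRows += count - 1;

  // A pairwise distance matrix for N points must have N rows of N entries.
  const isSquare = rows.every((row) => row.length === rows.length);

  return { rowCount: rows.length, duplicateRows, isSquare };
}

const dists = [
  [1.0, 0.1, 0.2],
  [1.0, 0.1, 0.2],
  [1.0, 0.1, 0.2],
  [1.0, 0.1, 0.2],
];
const report = describeInput(dists);
console.log(report); // { rowCount: 4, duplicateRows: 3, isSquare: false }
```

Whether duplicates alone are enough to produce NaNs in this library is not established in this thread; the check only makes it easier to test that hypothesis by running once with and once without the duplicate rows.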

lipsumar avatar Aug 14 '16 04:08 lipsumar

@lipsumar try this module instead https://github.com/scienceai/tsne-js

OrKoN avatar Aug 14 '16 09:08 OrKoN

I had a similar problem and I believe I found the reason. Just submitted a pull request.

piotrgrudzien avatar Oct 22 '16 11:10 piotrgrudzien

Should this be closed now?

pengowray avatar Sep 22 '18 01:09 pengowray

I am still seeing this issue. Has this been fixed?

Shravya-M avatar Mar 26 '20 06:03 Shravya-M