SpectralNet

not able to reproduce paper results

Open ronslos opened this issue 7 years ago • 15 comments

I have been running the code with the default params, but I don't get any substantial decrease in the loss, and the results look nothing like the ones in the paper. Here is what I am getting on the CC dataset:

image

Please advise on what should be changed in order to achieve results such as in the paper. Thanks.

ronslos avatar Feb 08 '18 15:02 ronslos

What OS and versions of sklearn, numpy, tensorflow, and keras are you using? If there are any changes at all, can you try stashing them and running the CC code again?
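For anyone gathering this information, a minimal sketch of one way to collect it in a single step. The helper name `report_versions` is hypothetical and not part of the SpectralNet repository; it only reads each package's `__version__` attribute and notes packages that fail to import.

```python
import importlib
import platform
import sys


def report_versions(packages=("sklearn", "numpy", "tensorflow", "keras")):
    """Return a dict with the OS, Python version, and each package's version.

    Hypothetical helper for bug reports; packages that cannot be imported
    are recorded with a short note instead of raising.
    """
    versions = {
        "os": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # some packages raise errors other than ImportError
            versions[name] = "not importable ({})".format(type(exc).__name__)
    return versions
```

Printing the items of `report_versions()` gives one line per package that can be pasted into the issue as text rather than as a screenshot.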

lihenryhfl avatar Feb 08 '18 16:02 lihenryhfl

Here are all the packages I am using currently. image

I have not made any changes to the original code besides changing the name of the dataset and setting the active gpu index. If you can tell me which versions you are using on your system I can try to test with your configuration.

ronslos avatar Feb 08 '18 16:02 ronslos

That's strange. I have tested on two macOS systems and one Linux system, once using your version configuration, but I can't seem to reproduce the error.

I have a few more questions:

  1. What OS are you using?
  2. Are you using python 2 or python 3?
  3. Are you running run.py, with src/applications/ as your working directory?

EDIT: I was not running with the gpu build of tensorflow-1.5.0. Can you try running with no GPU, or downgrading to tensorflow-1.4.0, and running again?

lihenryhfl avatar Feb 08 '18 18:02 lihenryhfl

  1. I am running Ubuntu 16.04
  2. My python version is 3.6
  3. I ran run.py from the SpectralNet root directory

I have downgraded to tensorflow 1.4.0 cpu only as you suggested. Still no change.

ronslos avatar Feb 08 '18 21:02 ronslos

Everything else seems normal, except that we haven't tried this on a machine with Ubuntu 16.04 before. It would be strange if this were the issue, but please try running the cc code on a different machine (preferably a mac) if possible.

lihenryhfl avatar Feb 08 '18 21:02 lihenryhfl

It's working on mac. I wonder, what could cause it to vary between OS's? Thanks!

ronslos avatar Feb 08 '18 22:02 ronslos

No problem! Agreed, it's definitely still an issue, and a pretty big one, since with v1.5.0 tensorflow-gpu now implicitly requires Ubuntu 16.04. I'm currently working on gaining access to a machine with Ubuntu 16.04 and seeing if I can reproduce it on my end.

lihenryhfl avatar Feb 08 '18 22:02 lihenryhfl

I'm still working with your code. I managed to run the example datasets given in your code, but when I build my own dataset following an example given in your paper I get the following results:

image

Here is the code I use to build this dataset:

import numpy as np

def generate_circles(n=2400, circles_num=3, noise_sigma=0.01, train_set_fraction=1.):
    pts_per_cluster = n // circles_num
    initial_r = 1.0
    r = initial_r
    x = np.zeros([0, 2])
    y = np.zeros([0, 1])
    # generate clusters: concentric circles with shrinking radius, labelled 0..circles_num-1
    for i in range(circles_num):
        theta = (np.random.uniform(0, 1, pts_per_cluster) * 2 * np.pi).reshape(pts_per_cluster, 1)
        cluster = np.concatenate((np.cos(theta) * r, np.sin(theta) * r), axis=1)
        x = np.concatenate((x, cluster), axis=0)
        y = np.concatenate((y, i * np.ones(shape=(pts_per_cluster, 1))), axis=0)
        r -= initial_r / circles_num

    # add noise to x
    x = x + np.random.randn(x.shape[0], 2) * noise_sigma

    # shuffle (use the actual row count, which is below n when n is not divisible by circles_num)
    n_pts = x.shape[0]
    p = np.random.permutation(n_pts)
    y = y[p]
    x = x[p]

    # make train and test splits
    n_train = int(n_pts * train_set_fraction)
    x_train, x_test = x[:n_train], x[n_train:]
    y_train, y_test = y[:n_train].flatten(), y[n_train:].flatten()

    # the snippet as posted had no return statement; this return order is an assumption
    return x_train, x_test, y_train, y_test

Basically I copied make_cc and modified it a bit. Can you please guide me on how to get SpectralNet to converge for this dataset?

ronslos avatar Mar 11 '18 13:03 ronslos

I'll look into this. But the most important hyperparameters to tweak are n_nbrs, scale_nbrs and affinity, so I'd recommend starting there.
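To make the role of these hyperparameters concrete, here is a rough sketch of a k-nearest-neighbor Gaussian affinity. It is not taken from the repository: the function name `knn_gaussian_affinity` is hypothetical, and the choice of scale (the median distance to the `scale_nbrs`-th neighbor) is an assumption made for illustration. It builds the full distance matrix, so it only suits small datasets.

```python
import numpy as np


def knn_gaussian_affinity(x, n_nbrs=3, scale_nbrs=2):
    """Symmetric Gaussian affinity restricted to each point's n_nbrs nearest neighbors.

    Assumption for this sketch: one global scale, set to the median distance
    from each point to its scale_nbrs-th nearest neighbor.
    """
    n = x.shape[0]
    # pairwise Euclidean distances, shape (n, n)
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # for distinct points, column 0 of the sorted order is the point itself
    order = np.argsort(dist, axis=1)
    sorted_dist = np.take_along_axis(dist, order, axis=1)
    scale = np.median(sorted_dist[:, scale_nbrs])
    # fill in affinities only for the n_nbrs nearest neighbors of each point
    w = np.zeros((n, n))
    rows = np.repeat(np.arange(n), n_nbrs)
    cols = order[:, 1:n_nbrs + 1].ravel()
    w[rows, cols] = np.exp(-dist[rows, cols] ** 2 / (2.0 * scale ** 2))
    # symmetrize: keep an edge if either point lists the other as a neighbor
    return np.maximum(w, w.T), scale
```

In a sketch like this, a larger `n_nbrs` connects each point to more distant points, which on concentric circles can create edges between rings once the neighborhood radius approaches the ring spacing, while `scale_nbrs` controls how quickly the affinity decays with distance.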

lihenryhfl avatar Mar 14 '18 03:03 lihenryhfl

Hello,

While running your SpectralNet code, I ran into the same reproducibility problem.

For the "cc" dataset, my results were very close to the paper's.

However, with the default code and hyperparameter settings, I got ACC 0.752, NMI 0.745 for mnist and ACC 0.747, NMI 0.448 for reuters, which is far below the numbers reported in the paper (ACC 0.971, NMI 0.924 for mnist; ACC 0.803, NMI 0.532 for reuters).

I found that one hyperparameter, "patience epochs", does not match Table 3 in the paper, where its value also varies by dataset. After setting it to 10 as shown in the table, the accuracy goes up for mnist but down for reuters: ACC 0.791, NMI 0.791 for mnist and ACC 0.619, NMI 0.328 for reuters.

Can you advise on how to reach the accuracy reported in the paper?

FYI, my environment is CentOS Linux 7 with Tensorflow 1.9.0 and Python 3.6.3.
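For readers comparing numbers: clustering ACC is usually computed after matching predicted cluster ids to true labels, since cluster ids are arbitrary. Below is a minimal sketch of that idea. The function name `clustering_accuracy` is hypothetical, the repository's own metric code may differ, and the brute-force search over permutations is only practical for a handful of clusters.

```python
from itertools import permutations

import numpy as np


def clustering_accuracy(y_true, y_pred, n_clusters):
    """Best accuracy over all one-to-one mappings of cluster ids to labels.

    Brute force over permutations, so the cost grows factorially with n_clusters.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    # counts[i, j] = number of points in predicted cluster i with true label j
    counts = np.zeros((n_clusters, n_clusters), dtype=int)
    np.add.at(counts, (y_pred, y_true), 1)
    best = 0
    for perm in permutations(range(n_clusters)):
        # perm[i] is the true label assigned to predicted cluster i
        matched = counts[np.arange(n_clusters), list(perm)].sum()
        best = max(best, matched)
    return best / len(y_true)
```

For ten clusters, as in mnist, the same matching is normally solved with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment` applied to the count matrix) rather than by enumeration.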

oj9040 avatar Aug 09 '18 20:08 oj9040

That's perplexing. I don't have access to a CentOS operating system at the moment. Can you try running this on Ubuntu or Mac?

lihenryhfl avatar Aug 13 '18 00:08 lihenryhfl

Thank you for the advice. Could you specify the detailed version of Ubuntu, Tensorflow, and Python that you have tested on?

oj9040 avatar Aug 14 '18 15:08 oj9040

I've tested on Python 3.4-3.6, Tensorflow 1.4-1.8, and Ubuntu 14.04, 16.04, and 18.04. I have also tried running Tensorflow 1.5 on Python 3.5 on macOS. I will try running on Tensorflow 1.9 by the end of this week and get back to you. In the meantime, if convenient, can you try one of these?

lihenryhfl avatar Aug 15 '18 15:08 lihenryhfl

I have tried Python 3.5 with Tensorflow 1.4 on both Ubuntu 14.04.5 and CentOS 7. The accuracy is now as expected: ACC 0.969 and NMI 0.921 (Ubuntu 14.04.5); ACC 0.97 and NMI 0.922 (CentOS 7).

Based on this, the OS does not seem to be the cause of the reproducibility issue; it appears to come from the Python and Tensorflow versions instead.

oj9040 avatar Aug 18 '18 22:08 oj9040

With python 3.6, tensorflow 1.4, keras 2.1.6 on Ubuntu 14.04: mnist -> ACC: 0.97, NMI: 0.923; reuters -> ACC: 0.812, NMI: 0.544. But it doesn't work with python 3.7, tensorflow 1.15, keras 2.3 on Ubuntu 14.04, and I don't know why. (Screenshots: 360 screenshot 17491102255147, 360 screenshot 18490928707292)

angus040107 avatar Jun 01 '20 06:06 angus040107