pygcn
Dataset partition
Hi tkipf, thanks for sharing your work.
There are a total of 2708 lines in cora.content. However, in utils.py the data is split as follows:
idx_train = range(140)
idx_val = range(200, 500)
idx_test = range(500, 1500)
May I ask what the reason is for splitting the data this way? Thank you.
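A minimal sketch of how such index ranges are typically used during training in this PyTorch implementation (the dummy output/labels tensors below are only for illustration; the actual utils.py/train.py may differ in detail):

```python
import torch
import torch.nn.functional as F

# Fixed splits as quoted above; Cora has 2708 nodes in total.
idx_train = torch.arange(0, 140)      # 140 labeled nodes for training
idx_val = torch.arange(200, 500)      # 300 nodes for validation
idx_test = torch.arange(500, 1500)    # 1000 nodes for testing

# Dummy stand-ins for the model output and the labels, just to show the indexing.
num_nodes, num_classes = 2708, 7
output = torch.randn(num_nodes, num_classes).log_softmax(dim=1)
labels = torch.randint(0, num_classes, (num_nodes,))

# The GCN forward pass always runs on the full graph; only the loss and
# metrics are restricted to the relevant index set.
loss_train = F.nll_loss(output[idx_train], labels[idx_train])
acc_test = (output[idx_test].argmax(dim=1) == labels[idx_test]).float().mean()
print(loss_train.item(), acc_test.item())
```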
The specific splits are chosen arbitrarily, but consistent in size with the ones we use in the paper. For the specific splits, have a look at the TensorFlow GCN implementation under https://github.com/tkipf/gcn
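If you want to reproduce the exact splits from the paper, one option is to read them out of the TensorFlow repository's data loader and reuse the indices here. This is only a sketch, assuming utils.py from https://github.com/tkipf/gcn is importable and that its load_data returns per-split boolean masks:

```python
import numpy as np

# Assumption: utils.py from https://github.com/tkipf/gcn is on the Python path;
# its load_data is assumed to return boolean masks marking the train/val/test nodes.
from utils import load_data

adj, features, y_train, y_val, y_test, train_mask, val_mask, test_mask = load_data('cora')

# Convert the boolean masks into index arrays usable with the PyTorch-style splits above.
idx_train = np.where(train_mask)[0]
idx_val = np.where(val_mask)[0]
idx_test = np.where(test_mask)[0]
print(len(idx_train), len(idx_val), len(idx_test))  # should be 140 / 500 / 1000 for Cora
```

Note that the node ordering may also differ between the two data loaders, so the indices only make sense relative to the loader that produced them.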
@tkipf Hi tkipf,
After running this PyTorch version of GCN, the accuracy I get (~83% on Cora) is much higher than the result (~81%) reported in your paper (TensorFlow version), and the load_data function and data format also differ from those in the TensorFlow version.
I suspect the reason is that the training/test split is different. When I change your load_data function, the accuracy is sometimes only ~77%; averaged over 10 random splits, the result is ~80%.
Are the 140 training nodes not exactly the ones used for training in the TensorFlow version? Is the original training/test split in this PyTorch version "better" than the one in your paper/TensorFlow version?
Your reply will be highly appreciated.
Thank you!
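For reference, the random-split experiment described above could be sketched roughly as below. The train_and_eval helper is hypothetical and stands in for one full training run of the pygcn model on the given splits:

```python
import numpy as np

def random_splits(num_nodes=2708, n_train=140, n_val=300, n_test=1000, seed=0):
    """Draw disjoint train/val/test index sets with the same sizes as pygcn's defaults."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    idx_train = perm[:n_train]
    idx_val = perm[n_train:n_train + n_val]
    idx_test = perm[n_train + n_val:n_train + n_val + n_test]
    return idx_train, idx_val, idx_test

# Averaging test accuracy over 10 random splits:
# (train_and_eval is a hypothetical wrapper around one pygcn training run)
# accs = [train_and_eval(*random_splits(seed=s)) for s in range(10)]
# print(np.mean(accs))
```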
When I trained it, it stopped at 84.00% accuracy (see below). I think the idea is that you need to tweak the PyTorch version yourself!
Optimization Finished!
Total time elapsed: 2.1585s
Test set results: loss= 0.7223 accuracy= 0.8400