pygcn
Dataset partition
Hi tkipf, thanks for sharing your work.
There are a total of 2708 lines in cora.content. However, in utils.py the data is split as follows:
idx_train = range(140)
idx_val = range(200, 500)
idx_test = range(500, 1500)
May I ask what the reason is for splitting the data this way? Thank you.
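A minimal sketch of how such index ranges are typically used during training in this PyTorch implementation (the dummy output/labels tensors below are only for illustration; the actual utils.py/train.py may differ in detail):

```python
import torch
import torch.nn.functional as F

# Fixed splits as quoted above; Cora has 2708 nodes in total.
idx_train = torch.arange(0, 140)      # 140 labeled nodes for training
idx_val = torch.arange(200, 500)      # 300 nodes for validation
idx_test = torch.arange(500, 1500)    # 1000 nodes for testing

# Dummy stand-ins for the model output and the labels, just to show the indexing.
num_nodes, num_classes = 2708, 7
output = torch.randn(num_nodes, num_classes).log_softmax(dim=1)
labels = torch.randint(0, num_classes, (num_nodes,))

# The GCN forward pass always runs on the full graph; only the loss and
# metrics are restricted to the relevant index set.
loss_train = F.nll_loss(output[idx_train], labels[idx_train])
acc_test = (output[idx_test].argmax(dim=1) == labels[idx_test]).float().mean()
print(loss_train.item(), acc_test.item())
```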
The specific splits are chosen arbitrarily, but consistent in size with the ones we use in the paper. For the specific splits, have a look at the TensorFlow GCN implementation under https://github.com/tkipf/gcn
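If you want to reproduce the exact splits from the paper, one option is to read them out of the TensorFlow repository's data loader and reuse the indices here. This is only a sketch, assuming utils.py from https://github.com/tkipf/gcn is importable and that its load_data returns per-split boolean masks:

```python
import numpy as np

# Assumption: utils.py from https://github.com/tkipf/gcn is on the Python path;
# its load_data is assumed to return boolean masks marking the train/val/test nodes.
from utils import load_data

adj, features, y_train, y_val, y_test, train_mask, val_mask, test_mask = load_data('cora')

# Convert the boolean masks into index arrays usable with the PyTorch-style splits above.
idx_train = np.where(train_mask)[0]
idx_val = np.where(val_mask)[0]
idx_test = np.where(test_mask)[0]
print(len(idx_train), len(idx_val), len(idx_test))  # should be 140 / 500 / 1000 for Cora
```

Note that the node ordering may also differ between the two data loaders, so the indices only make sense relative to the loader that produced them.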
@tkipf Hi tkipf,
After running this PyTorch version of GCN, the accuracy I get (~83% on Cora) is much higher than the result (~81%) reported in your paper (TensorFlow version), and the load_data function and data format also differ from those in the TensorFlow version.
I suspect the reason is that the training/test split is different. When I change your load_data function, the accuracy is sometimes only ~77%; averaged over 10 random splits, the result is ~80%.
Are the 140 training nodes not exactly the ones used for training in the TensorFlow version? Is the original training/test split in this PyTorch version "better" than the one in your paper/TensorFlow version?
Your reply will be highly appreciated.
Thank you!
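For reference, the random-split experiment described above could be sketched roughly as below. The train_and_eval helper is hypothetical and stands in for one full training run of the pygcn model on the given splits:

```python
import numpy as np

def random_splits(num_nodes=2708, n_train=140, n_val=300, n_test=1000, seed=0):
    """Draw disjoint train/val/test index sets with the same sizes as pygcn's defaults."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    idx_train = perm[:n_train]
    idx_val = perm[n_train:n_train + n_val]
    idx_test = perm[n_train + n_val:n_train + n_val + n_test]
    return idx_train, idx_val, idx_test

# Averaging test accuracy over 10 random splits:
# (train_and_eval is a hypothetical wrapper around one pygcn training run)
# accs = [train_and_eval(*random_splits(seed=s)) for s in range(10)]
# print(np.mean(accs))
```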
When I trained it, it stopped at 84.00% accuracy (see below). I think the idea is that you need to tweak the PyTorch version yourself!
Optimization Finished!
Total time elapsed: 2.1585s
Test set results: loss= 0.7223 accuracy= 0.8400