pygcn

output = model(features, adj): Are test features involved in the training process?

Open houhouhouhou11 opened this issue 4 years ago • 4 comments

Hello, tkipf! Thanks for sharing! In the training process, features has 2708 rows. Does it include the test samples? Thank you very much!

houhouhouhou11 Oct 23 '19 12:10

Although the model takes all of the data as input, it calculates the loss only on idx_train, which is defined in load_data. Hope this helps.
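To make this concrete, here is a minimal NumPy sketch of a loss restricted to the training indices. It is not the actual pygcn code (which uses PyTorch); the helper names `log_softmax` and `masked_nll`, the random logits standing in for `model(features, adj)`, and the tiny 6-node split are all assumptions made for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def masked_nll(logits, labels, idx):
    """Negative log-likelihood averaged over the nodes in idx only."""
    log_probs = log_softmax(logits)
    return -log_probs[idx, labels[idx]].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))        # stand-in for model(features, adj): 6 nodes, 3 classes
labels = np.array([0, 1, 2, 0, 1, 2])
idx_train = np.array([0, 1])            # hypothetical split, not the Cora split
idx_test = np.array([4, 5])

loss_train = masked_nll(logits, labels, idx_train)

# Overwrite the test labels: the training loss does not move,
# because test labels are never indexed.
labels_changed = labels.copy()
labels_changed[idx_test] = 0
loss_after = masked_nll(logits, labels_changed, idx_train)
```

So the test labels never enter the loss. Whether the test features can still influence it is a separate question, since the logits for the training nodes are computed from the whole graph.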

ChrisZhangcx Nov 10 '19 04:11

@ChrisZhangcx Thanks for your answer! In the training procedure, if test samples share edges with training samples, the corresponding entries in the adjacency matrix are 1. The test samples' features are then involved in the training samples' outputs, and therefore in the training loss. Thank you very much!
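A small sketch of what I mean, assuming a single propagation step of the form D^-1 (A + I) X W with no bias and no nonlinearity. This is a simplification, not the real model; the 3-node graph, the weights, and the function names `normalize_adj` and `gcn_layer` are made up for illustration.

```python
import numpy as np

def normalize_adj(adj):
    """Add self-loops and row-normalize: D^-1 (A + I)."""
    adj_hat = adj + np.eye(adj.shape[0])
    return adj_hat / adj_hat.sum(axis=1, keepdims=True)

def gcn_layer(features, adj_norm, weight):
    """One linear propagation step: aggregate neighbors, then project."""
    return adj_norm @ features @ weight

# Node 0: train node linked to node 1. Node 1: test node. Node 2: isolated train node.
adj = np.array([[0, 1, 0],
                [1, 0, 0],
                [0, 0, 0]], dtype=float)
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
weight = np.array([[1.0, -1.0],
                   [0.5, 2.0]])

adj_norm = normalize_adj(adj)
out_before = gcn_layer(features, adj_norm, weight)

# Perturb only the test node's features.
features_perturbed = features.copy()
features_perturbed[1] += 10.0
out_after = gcn_layer(features_perturbed, adj_norm, weight)

# The train node that neighbors the test node changes; the isolated one does not.
train_connected_changed = not np.allclose(out_before[0], out_after[0])
train_isolated_changed = not np.allclose(out_before[2], out_after[2])
```

In this toy case the output of train node 0 depends on the features of test node 1, so a loss computed on node 0 depends on them as well.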

houhouhouhou11 Nov 10 '19 06:11

@houhouhouhou11

You are right. It does seem like we combine the information of both train and test data.

I try to think about this issue in a different way: we do need these edges in the adjacency matrix, since our task is to compute node representations from each node's own features as well as the topology information of its neighbors.

If we remove the edges between the train and test nodes (by manually masking them in the adjacency matrix), some of the following issues might occur:

  1. It changes the distribution of the data and features. Intuitively, we assume that the train and test data both follow the distribution of the whole dataset.
  2. The graph becomes disconnected. In this case only 140 nodes are used for training and 1000 for testing, so some of the train nodes might be left with no neighbors at all.
  3. It damages the graph structure information, from which we can extract features that further strengthen model performance.
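Here is a rough NumPy sketch of the masking idea and of the second issue. The 5-node path graph, the split, and the helper names `mask_train_test_edges` and `isolated_nodes` are hypothetical and only meant to illustrate the effect, not to reproduce the Cora setup.

```python
import numpy as np

def mask_train_test_edges(adj, idx_train, idx_test):
    """Return a copy of adj with every train-test edge set to zero."""
    masked = adj.copy()
    masked[np.ix_(idx_train, idx_test)] = 0
    masked[np.ix_(idx_test, idx_train)] = 0
    return masked

def isolated_nodes(adj, idx):
    """List the nodes in idx that have no neighbors (degree zero)."""
    degree = adj.sum(axis=1)
    return [int(i) for i in idx if degree[i] == 0]

# Path graph 0 - 1 - 2 - 3 - 4 with a toy split; node 4 is unlabeled.
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    adj[i, j] = adj[j, i] = 1.0

idx_train = np.array([0, 2])
idx_test = np.array([1, 3])

masked = mask_train_test_edges(adj, idx_train, idx_test)

isolated_before = isolated_nodes(adj, idx_train)     # no isolated train nodes
isolated_after = isolated_nodes(masked, idx_train)   # both train nodes lose all neighbors
removed_edges = int((adj.sum() - masked.sum()) / 2)  # undirected edges removed
```

In this toy graph every train node only touches test nodes, so masking leaves both of them isolated and the model would see nothing but their own features.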

Thanks for your reply! I'm a beginner with graph networks. Please let me know if you have any comments.

ChrisZhangcx Nov 10 '19 07:11

@ChrisZhangcx Thanks for your reply, I'm a beginner too. Your answer is very reasonable. Can I think of the choice between keeping and removing the edges between the train and test nodes as a question of transductive versus inductive reasoning? In fact, I'm Chinese too! ^_^ All in all, thank you very much!

houhouhouhou11 Nov 10 '19 08:11