gae
A question about negative samples generation in preprocessing.py
Hi Thomas,
I'm confused by how you generate the negative edges for the validation set:
val_edges_false = []
while len(val_edges_false) < len(val_edges):
    idx_i = np.random.randint(0, adj.shape[0])
    idx_j = np.random.randint(0, adj.shape[0])
    if idx_i == idx_j:
        continue
    if ismember([idx_i, idx_j], train_edges):
        continue
    if ismember([idx_j, idx_i], train_edges):
        continue
    if ismember([idx_i, idx_j], val_edges):
        continue
    if ismember([idx_j, idx_i], val_edges):
        continue
    if val_edges_false:
        if ismember([idx_j, idx_i], np.array(val_edges_false)):
            continue
        if ismember([idx_i, idx_j], np.array(val_edges_false)):
            continue
    val_edges_false.append([idx_i, idx_j])
However, the negative test set is filtered against all edges instead:
if ismember([idx_i, idx_j], edges_all):
    continue
Why does the validation set check ismember([idx_j, idx_i], train_edges) and ismember([idx_i, idx_j], val_edges) instead of ismember([idx_i, idx_j], edges_all)?
Wu Shiauthie
Hi, I had the same issue.
I gave it some thought, and I realized that the negative validation/training samples should be allowed to include edges from the test set; otherwise the algorithm would have an unfair advantage on the test samples.
In other words, edges in the test set can be sampled as negative examples in the validation/training sets (this could happen in a real-world scenario).
So this explains why ismember is checked separately against train_edges and val_edges. However, there is this line:
assert ~ismember(val_edges_false, edges_all)
whose purpose I don't understand.
I understand why the AssertionError sometimes appears when running the program: a pair in val_edges_false may also appear in edges_all.
  File "train.py", line 47, in <module>
    adj_train, train_edges, val_edges, val_edges_false, test_edges, test_edges_false = mask_test_edges(adj)
  File "/home/lf/work/gae/gae/preprocessing.py", line 100, in mask_test_edges
    assert ~ismember(val_edges_false, edges_all)
AssertionError
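To make the failure concrete, here is a minimal sketch of how a pair can pass the validation filter and still be a real edge. The ismember helper is an assumed approximation of the one in preprocessing.py (the thread does not show it), and the toy edge lists are hypothetical.

```python
import numpy as np

def ismember(a, b, tol=5):
    # Assumed helper: True if any row of `a` matches any row of `b`.
    a = np.asarray(a)
    b = np.asarray(b)
    rows_close = np.all(np.round(a - b[:, None], tol) == 0, axis=-1)
    return np.any(rows_close)

# Hypothetical hand-made split of a tiny undirected graph.
train_edges = np.array([[0, 1]])
val_edges = np.array([[1, 2]])
test_edges = np.array([[2, 3]])

# Assumption: edges_all stores both directions of every edge.
upper = np.vstack([train_edges, val_edges, test_edges])
edges_all = np.vstack([upper, upper[:, ::-1]])

candidate = [2, 3]  # this pair is a test edge

# The validation filter only looks at train_edges and val_edges.
passes_val_filter = not (
    ismember(candidate, train_edges)
    or ismember(candidate[::-1], train_edges)
    or ismember(candidate, val_edges)
    or ismember(candidate[::-1], val_edges)
)

# The later assertion compares against edges_all, which includes test edges.
is_real_edge = bool(ismember(candidate, edges_all))

print(passes_val_filter, is_real_edge)
```

Under these assumptions the candidate is accepted as a validation negative even though it is an edge of the graph, which is exactly the situation the assertion rejects. Whether it happens in a given run depends on the random draws, which would explain why the error appears only sometimes.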
Hi,
I think the correct code, which never triggers the AssertionError, is equivalent to the following:
val_edges_false = []
while len(val_edges_false) < len(val_edges):
    idx_i = np.random.randint(0, adj.shape[0])
    idx_j = np.random.randint(0, adj.shape[0])
    if idx_i == idx_j:
        continue
    if ismember([idx_j, idx_i], edges_all):
        continue
    if val_edges_false:
        if ismember([idx_j, idx_i], np.array(val_edges_false)):
            continue
        if ismember([idx_i, idx_j], np.array(val_edges_false)):
            continue
    val_edges_false.append([idx_i, idx_j])
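For anyone who wants to try this idea outside the repository, the following is a self-contained sketch of the same filtering. It is not the code from preprocessing.py: sample_negative_edges is a hypothetical helper, and it swaps the ismember calls for Python set lookups and checks both directions explicitly, so it does not rely on edges_all being symmetric.

```python
import numpy as np

def sample_negative_edges(num_nodes, edges_all, num_samples, seed=0):
    """Hypothetical helper: draw node pairs that are not edges in either direction."""
    rng = np.random.default_rng(seed)
    existing = {(int(i), int(j)) for i, j in edges_all}
    negatives = set()
    # Note: loops forever if fewer than num_samples non-edges exist.
    while len(negatives) < num_samples:
        i, j = (int(x) for x in rng.integers(0, num_nodes, size=2))
        if i == j:
            continue  # no self-loops
        if (i, j) in existing or (j, i) in existing:
            continue  # reject any real edge, including test edges
        if (i, j) in negatives or (j, i) in negatives:
            continue  # reject duplicates in either direction
        negatives.add((i, j))
    return np.array(sorted(negatives))

# Toy path graph 0-1-2-3-4 (hypothetical data), stored in one direction only.
edges_all = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
negatives = sample_negative_edges(num_nodes=5, edges_all=edges_all, num_samples=3)
print(negatives)
```

Because every accepted pair is checked against all edges, an assertion like the one in the traceback cannot fail for these samples. The trade-off raised earlier in the thread still applies: this filter uses knowledge of the test edges when building validation negatives.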
Hello, I am having the same issue:
assert ~ismember(val_edges_false, edges_all)
AssertionError
Did anyone find a solution? Kindly help.