GraphSAGE

about models.py

Achulei opened this issue 6 years ago · 3 comments

I find that tf.nn.fixed_unigram_candidate_sampler() outputs negative samples that include some positive samples. The output of negative sampling should not include any positive samples (those listed in true_classes). I am confused about this.
import numpy as np
import tensorflow as tf  # TF 1.x API, as in the original issue

labels = [1, 2, 3]            # positive (true) classes
batch_size = len(labels)
degs = np.array([3, 2, 3, 4, 2, 1, 1, 5, 5, 4, 6, 2, 1, 7])  # node degrees used as unigram counts

labels = tf.reshape(
    tf.cast(labels, dtype=tf.int64),
    [batch_size, 1])
# Draw 4 "negative" ids from the degree^0.75 distribution.
neg_samples, _1, _2 = tf.nn.fixed_unigram_candidate_sampler(
    true_classes=labels,
    num_true=1,
    num_sampled=4,
    unique=False,
    range_max=len(degs),
    distortion=0.75,
    unigrams=degs.tolist())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(neg_samples))
Possible output:
run 1: [ 0  1  9 11]
run 2: [ 7  2 13  0]
run 3: [ 3  9  0  3]
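
If collisions with the positives need to be avoided, one option (not something the GraphSAGE code does) is to post-filter the sampled ids against the true classes and redraw until enough clean negatives remain. A minimal sketch in NumPy, assuming the sampler output has already been fetched into a Python list; the helper name and signature are hypothetical:

import numpy as np

def filter_negatives(sampled, true_classes, num_needed, degs, distortion=0.75, rng=None):
    """Hypothetical helper: drop accidental positives and top up with fresh
    draws from the same degs**distortion distribution."""
    rng = rng or np.random.default_rng()
    positives = set(int(c) for c in np.asarray(true_classes).ravel())
    clean = [int(s) for s in sampled if int(s) not in positives]
    probs = np.asarray(degs, dtype=np.float64) ** distortion
    probs /= probs.sum()
    while len(clean) < num_needed:
        draw = int(rng.choice(len(degs), p=probs))
        if draw not in positives:
            clean.append(draw)
    return np.array(clean[:num_needed])

# Example with the values from the issue:
degs = np.array([3, 2, 3, 4, 2, 1, 1, 5, 5, 4, 6, 2, 1, 7])
print(filter_negatives([0, 1, 9, 11], [1, 2, 3], num_needed=4, degs=degs))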

Achulei avatar Mar 18 '19 12:03 Achulei

That could be true. We rely on the assumption that, when the full graph is much larger than the neighborhood computation graph, there is only a small probability that randomly sampled negatives hit a positive example. So in general this does not cause much of an issue, while it tremendously improves efficiency.
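
To put a rough number on that probability, one can compute the chance that a single draw from the distorted unigram distribution lands on one of the positives. On the 14-node toy example from the issue it is sizeable, but on a production-scale graph with millions of nodes and only a few positives per batch it becomes negligible. A quick check, assuming the same degs and labels as above:

import numpy as np

degs = np.array([3, 2, 3, 4, 2, 1, 1, 5, 5, 4, 6, 2, 1, 7], dtype=np.float64)
positives = [1, 2, 3]

probs = degs ** 0.75
probs /= probs.sum()

# Probability that a single sampled id collides with a positive.
p_collision = probs[positives].sum()
print(p_collision)  # ~0.2 on this 14-node toy graph; shrinks toward 0 as the graph grows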

You can check PinSAGE for better negative sampling schemes.

RexYing avatar May 28 '19 05:05 RexYing

Where can I find PinSAGE? Thanks.

anny0316 avatar Feb 11 '20 09:02 anny0316

The code for https://arxiv.org/pdf/1806.01973.pdf is not open source due to corporate constraints, but the changes relative to GraphSAGE are described in the paper. For negative sampling, you just need to implement a count-based PPR approximation; a rough sketch follows below.
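
The count-based PPR approximation described in the PinSAGE paper can be sketched as random walks with restart from a query node, counting how often each node is visited; nodes that rank high but not at the very top are then used as hard negatives. Below is a rough illustration only: the function names, the adjacency-dict input, and the exact rank window are illustrative choices, not code from PinSAGE (the paper samples hard negatives from roughly the 2000-5000 rank range).

import random
from collections import Counter

def ppr_visit_counts(adj, seed, num_walks=1000, walk_len=10, restart_prob=0.15, rng=None):
    """Approximate Personalized PageRank for `seed` by counting visits of
    random walks with restart. `adj` is a dict: node -> list of neighbors."""
    rng = rng or random.Random(0)
    counts = Counter()
    for _ in range(num_walks):
        node = seed
        for _ in range(walk_len):
            if rng.random() < restart_prob or not adj[node]:
                node = seed
            else:
                node = rng.choice(adj[node])
            counts[node] += 1
    return counts

def hard_negatives(adj, seed, rank_lo=500, rank_hi=5000, **kw):
    """Rank nodes by visit count and keep the mid-ranked ones as hard negatives:
    related enough to be informative, but not true neighbors of `seed`.
    The rank window here is illustrative."""
    counts = ppr_visit_counts(adj, seed, **kw)
    ranked = [n for n, _ in counts.most_common() if n != seed and n not in adj[seed]]
    return ranked[rank_lo:rank_hi]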

RexYing avatar Feb 11 '20 19:02 RexYing