
About GraphSAGE sampling on weighted graph

Open jindeok opened this issue 4 years ago • 19 comments

❓ Questions & Help

Hello. I really appreciate you sharing these GNN implementation examples. However, I have one short question.

I'm wondering what happens when I feed a weighted graph into the GraphSAGE example. Does NeighborSampler treat it as an unweighted (binary-edge) graph, or do the edge weights affect the sampling process?

jindeok avatar Jan 02 '21 12:01 jindeok

Currently, SAGEConv does not support weighted graphs, but GraphConv does (which is quite similar). Note that you need to pass both edge_index and edge_weight to the GNN op.
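For illustration (my sketch, not from the thread; all tensors are hypothetical toys), such a weighted call looks like:

import torch
from torch_geometric.nn import GraphConv

x = torch.randn(4, 16)                        # 4 nodes, 16 features each
edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 0]])     # 4 directed edges
edge_weight = torch.rand(edge_index.size(1))  # one weight per edge

conv = GraphConv(16, 32, aggr='mean')
out = conv(x, edge_index, edge_weight)        # weights scale each message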

rusty1s avatar Jan 02 '21 13:01 rusty1s

Can we use a sparse matrix for the edge weights?

1byxero avatar Jan 03 '21 04:01 1byxero

Do you mean using the SparseTensor class? That is possible by passing edge weights to the value argument:

from torch_sparse import SparseTensor

adj_t = SparseTensor(row=row, col=col, value=edge_weight, sparse_sizes=(N, N))
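As a follow-up sketch (my addition; it assumes row and col above hold the source and target nodes of edge_index), such a SparseTensor can replace edge_index in the conv call, and its value entries are then used as the edge weights. Note that PyG ops expect the transposed adjacency:

out = conv(x, adj_t.t())  # e.g. conv = GraphConv(in_channels, out_channels, aggr='mean')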

rusty1s avatar Jan 03 '21 11:01 rusty1s

This means we need one dense tensor storing the edge weights initially, right?

I don't want to do that, as that dense tensor takes up too much GPU memory. Is there any workaround?

1byxero avatar Jan 04 '21 07:01 1byxero

Thank you for your reply! It was very helpful.

Can I ask one more question? I'm also wondering whether I can apply the NeighborSampler function to a weighted graph if I want to construct a training set by sampling.

jindeok avatar Jan 04 '21 07:01 jindeok

NeighborSampler returns an e_id tensor, which can be used to index into the original edge weights:

loader = NeighborSampler(edge_index, ...)
for batch_size, n_id, adjs in loader:
    for edge_index, e_id, size in adjs:
        sampled_edge_weight = edge_weight[e_id]  # e_id indexes the full edge list

rusty1s avatar Jan 04 '21 15:01 rusty1s

You mean edge_weight = data.edge_attr here?

Actually, I built the weighted graph with networkx and then converted it to torch_geometric data using the from_networkx() function, and I think this conversion does not also transfer the edge attributes from the networkx graph. (Is only the structural information of the graph converted?)

In short, I am now stuck converting a weighted networkx graph into weighted torch_geometric graph data. Can you give me some tips?

Thanks for reading!

jindeok avatar Jan 07 '21 09:01 jindeok

Can you give me a short example to illustrate this issue?

rusty1s avatar Jan 07 '21 10:01 rusty1s

I'm sorry, I found that the from_networkx() method also converts edge weights.
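For reference, a minimal sketch of that conversion (toy graph; the attribute name on the resulting Data object follows the networkx edge key):

import networkx as nx
from torch_geometric.utils import from_networkx

G = nx.Graph()
G.add_edge(0, 1, weight=0.5)
G.add_edge(1, 2, weight=2.0)

data = from_networkx(G)
print(data.weight)  # one entry per directed edge; each undirected edge appears twice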

Anyway, here is my working example:

  • I want to aggregate information from sampled neighbor batches on a weighted graph.

(In the graphsage_unsupervised example code, I just replaced SAGEConv with GraphConv.)

import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GraphConv

class myGNN(nn.Module):
    def __init__(self, in_channels, hidden_channels, num_layers):
        super(myGNN, self).__init__()
        self.num_layers = num_layers
        self.convs = nn.ModuleList()
        for i in range(num_layers):
            in_channels = in_channels if i == 0 else hidden_channels
            self.convs.append(GraphConv(in_channels, hidden_channels, aggr='mean'))

    def forward(self, x, adjs):
        for i, (edge_index, _, size) in enumerate(adjs):
            x_target = x[:size[1]]  # Target nodes are always placed first.
            x = self.convs[i]((x, x_target), edge_index)
            if i != self.num_layers - 1:
                x = x.relu()
                x = F.dropout(x, p=0.5, training=self.training)
        return x

    def full_forward(self, x, edge_index):
        for i, conv in enumerate(self.convs):
            x = conv(x, edge_index)
            if i != self.num_layers - 1:
                x = x.relu()
                x = F.dropout(x, p=0.5, training=self.training)
        return x


data = from_networkx(G)  # G: weighted graph built with networkx
train_loader = NeighborSampler(data.edge_index, sizes=[10, 10], batch_size=256,
                               shuffle=True, num_nodes=data.num_nodes)

model.train()

total_loss = 0
for batch_size, n_id, adjs in train_loader:
    for edge_index, e_id, size in adjs:
        sampled_edge_weight = edge_weight[e_id]  # I am stuck at this part (this code does not work)
    adjs = [adj.to(device) for adj in adjs]
    optimizer.zero_grad()
    out = model(...)
    ...

  1. How do I design the GNN forward method so that it can take a weighted graph as input?
  2. In the training phase, how can I get the weight information from train_loader?

jindeok avatar Jan 07 '21 10:01 jindeok

You need to index-select the edge weights coming from data.edge_weight. Since e_id indexes into the full, original edge list, you can pass the complete data.edge_weight tensor to the model. The corrected example would look similar to this:

def forward(self, x, adjs, edge_weight):
    for i, (edge_index, e_id, size) in enumerate(adjs):
        x_target = x[:size[1]]  # Target nodes are always placed first.
        x = self.convs[i]((x, x_target), edge_index, edge_weight[e_id])
        if i != self.num_layers - 1:
            x = x.relu()
            x = F.dropout(x, p=0.5, training=self.training)
    return x

for batch_size, n_id, adjs in train_loader:
    ...
    model(x[n_id], adjs, data.edge_weight)
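One practical note (my addition, not from the thread): if the model runs on the GPU while data.edge_weight lives on the CPU, the indexed weights have to be moved to the same device, e.g.:

x = self.convs[i]((x, x_target), edge_index, edge_weight[e_id].to(x.device))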

rusty1s avatar Jan 07 '21 21:01 rusty1s

Currently, SAGEConv does not support weighted graphs, but GraphConv does (which is quite similar). Note that you need to pass both edge_index and edge_weight to the GNN op.

For the inductive case, it seems SAGEConv performs better than GraphConv. Is there any possibility of extending SAGEConv with edge weights? Also, after reading the GraphConv paper, I am not sure why it should be similar to SAGEConv.

pintonos avatar Jan 11 '22 09:01 pintonos

GraphConv is the same as SAGEConv if you specify aggr="mean". Let me know if that works for you.

rusty1s avatar Jan 11 '22 11:01 rusty1s

GraphConv is the same as SAGEConv if you specify aggr="mean". Let me know if that works for you.

Works fine now, thanks! The documentation of GraphConv only gives its node-wise formulation. What would the definition for the whole graph be, as in GCNConv for instance?

pintonos avatar Jan 18 '22 09:01 pintonos

X W_1 + D^{-1} A X W_2, where D is the diagonal degree matrix.
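A small numeric sketch of this graph-wise form (my addition; all tensors hypothetical, using a 4-node directed cycle so every node has degree one):

import torch

N, F_in, F_out = 4, 8, 2
X = torch.randn(N, F_in)
W1 = torch.randn(F_in, F_out)            # root (self) weight
W2 = torch.randn(F_in, F_out)            # neighbor weight

edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 0]])

A = torch.zeros(N, N)
A[edge_index[1], edge_index[0]] = 1.0    # A[i, j] = 1 for an edge j -> i
D_inv = torch.diag(1.0 / A.sum(dim=1))   # inverse degree matrix

out = X @ W1 + D_inv @ A @ X @ W2        # GraphConv with aggr='mean', no bias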

rusty1s avatar Jan 18 '22 09:01 rusty1s

I have a question: how can I generate positive and negative samples using NeighborSampler on a weighted graph?

In this code, pos_batch = random_walk(row, col, batch, walk_length=1, coalesced=True)[:, 1], how can the random walk be biased according to the edge weights?

kzh2ang avatar Apr 25 '22 13:04 kzh2ang

We do not have support for biased/weighted sampling of random walks yet, I am sorry.
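As a workaround sketch (my addition, not a PyG/torch_cluster API; it assumes a CSR layout where edge_weight is ordered to match rowptr and col, e.g. obtained from SparseTensor.csr()), one-step positives biased by weight could be drawn manually:

import torch

def weighted_one_step(rowptr, col, edge_weight, batch):
    # Draw one neighbor per batch node with probability proportional
    # to its outgoing edge weights; isolated nodes stay in place.
    pos = batch.clone()
    for i, node in enumerate(batch.tolist()):
        lo, hi = int(rowptr[node]), int(rowptr[node + 1])
        if hi > lo:
            j = torch.multinomial(edge_weight[lo:hi], 1)
            pos[i] = col[lo + j]
    return pos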

rusty1s avatar Apr 26 '22 07:04 rusty1s


Hi @kzh2ang, I have the same need as you. How did you solve it in the end?

ZRH0308 avatar Nov 18 '22 13:11 ZRH0308

We do not have support for biased/weighted sampling of random walks yet, I am sorry.

Hi @rusty1s, is it now possible to do weighted sampling of random walks? I need to sample positive neighbors based on the edge_weight.

If I just need to take a one-step neighbor, can I directly select, for each node, the neighbor with the largest edge_weight in the current batch as a positive sample? Do you think that will work?
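For illustration, such a hard top-1 selection could be sketched with torch_scatter (which PyG builds on); the function below is hypothetical, not an existing API:

import torch
from torch_scatter import scatter_max

def top_weight_neighbor(edge_index, edge_weight, num_nodes):
    src, dst = edge_index
    # Argmax of the weights grouped by source node; empty groups receive
    # the out-of-range sentinel edge_weight.numel().
    _, argmax = scatter_max(edge_weight, src, dim=0, dim_size=num_nodes)
    top = torch.full((num_nodes,), -1, dtype=torch.long)
    valid = argmax < edge_weight.numel()  # nodes with at least one out-edge
    top[valid] = dst[argmax[valid]]
    return top                            # -1 marks nodes without out-edges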

ZRH0308 avatar Nov 19 '22 05:11 ZRH0308

There is an open PR for this, see https://github.com/rusty1s/pytorch_cluster/pull/140. Maybe you can check it out to see if it fits your needs.

rusty1s avatar Nov 21 '22 16:11 rusty1s