dgl: Different behavior of the MultiLayerNeighborSampler when the graph has one node type and one edge type and when the graph is heterogeneous.

The block graph does not contain dst_node_types when the graph has one node type and one edge type, which requires a lot of custom code to write a unified API. Can you please advise whether this is by design or not?

https://github.com/dmlc/dgl/blob/92e77330659fe0f5673c76b966aad727752cb7ef/python/dgl/_dataloading/neighbor.py#L56

Jun 13 '22 23:06 bioannidis

Also, the dataloader we define needs to be different, otherwise the sampling will fail.

When there is a single node type and a single edge type, we need to pass a tensor of node IDs. Otherwise we need to pass a dictionary whose key is the node type and whose value is the tensor of node IDs. This does not allow us to unify the interface.

loader = dgl.dataloading.DistNodeDataLoader(g, train_idx
                                                    if len(g.etypes) == 1 and len(g.ntypes) == 1
                                                    else {predict_ntype: train_idx}, sampler,
                                                    batch_size=batch_size, shuffle=True, num_workers=0)
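
The inline conditional in the call above could be factored into a small helper so that the rest of the training script stays the same for both kinds of graph. This is only a sketch: `make_seed_nodes` is a hypothetical name, not a DGL function, and it assumes the rule described in this comment (a plain tensor for a single-type graph, a dictionary keyed by node type otherwise).

```python
def make_seed_nodes(ntypes, etypes, predict_ntype, train_idx):
    """Return train_idx in the form the data loader expects.

    Hypothetical helper (not part of DGL). Assumes the behavior described
    in this thread: a graph with one node type and one edge type takes the
    raw node IDs, any other graph takes {node_type: node_ids}.
    """
    single_type = len(ntypes) == 1 and len(etypes) == 1
    if single_type:
        return train_idx
    return {predict_ntype: train_idx}


# Intended use, mirroring the call in this comment:
# seeds = make_seed_nodes(g.ntypes, g.etypes, predict_ntype, train_idx)
# loader = dgl.dataloading.DistNodeDataLoader(
#     g, seeds, sampler, batch_size=batch_size, shuffle=True, num_workers=0)
```

Passing `g.ntypes` and `g.etypes` rather than the graph itself keeps the helper independent of the graph class.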

Jun 13 '22 23:06 bioannidis

The specific error is the following

    return forward_call(*input, **kwargs)
  File "/home/ivasilei/m5-gnn/python/m5gnn/model/rgcn_encoder.py", line 102, in forward
    inputs_dst = {k: v[:g.number_of_dst_nodes(k)] for k, v in inputs.items()}
  File "/home/ivasilei/m5-gnn/python/m5gnn/model/rgcn_encoder.py", line 102, in <dictcomp>
    inputs_dst = {k: v[:g.number_of_dst_nodes(k)] for k, v in inputs.items()}
  File "/home/ivasilei/.local/lib/python3.9/site-packages/dgl/heterograph.py", line 2414, in number_of_dst_nodes
    return self.num_dst_nodes(ntype)
  File "/home/ivasilei/.local/lib/python3.9/site-packages/dgl/heterograph.py", line 2476, in num_dst_nodes
    return self._graph.number_of_nodes(self.get_ntype_id_from_dst(ntype))
  File "/home/ivasilei/.local/lib/python3.9/site-packages/dgl/heterograph.py", line 1239, in get_ntype_id_from_dst
    raise DGLError('DST node type "{}" does not exist.'.format(ntype))
dgl._ffi.base.DGLError: DST node type "queryasin" does not exist.

Jun 13 '22 23:06 bioannidis

When there is a single node type and a single edge type, we need to pass a tensor of node IDs. Otherwise we need to pass a dictionary whose key is the node type and whose value is the tensor of node IDs. This does not allow us to unify the interface.

This is by design because DGL will not know which node type the train_idx belongs to when the graph is heterogeneous. What's the ideal usage in your case?

BTW, you could use DGLGraph.is_homogeneous to check whether a graph is homogeneous or not (instead of using num_ntypes == 1 and num_etypes == 1).

Jun 14 '22 02:06 jermainewang

OK, thanks for this. I understand the need for the dictionary. The question is why it does not work with the dictionary when the graph is homogeneous.

Jun 14 '22 15:06 bioannidis

By the way, this is a DistGraph, so that check fails with AttributeError: 'DistGraph' object has no attribute 'is_homogeneous'
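
One possible workaround for the missing attribute is a check that uses `is_homogeneous` when the graph object provides it and otherwise falls back to counting node and edge types. This is a sketch only: `is_single_type_graph` is a hypothetical helper, and it assumes the graph object exposes `ntypes` and `etypes`, as the `g.ntypes` and `g.etypes` calls earlier in this thread suggest.

```python
def is_single_type_graph(g):
    """Report whether g has exactly one node type and one edge type.

    Hypothetical helper (not part of DGL). Prefers g.is_homogeneous when it
    exists; otherwise counts the entries of g.ntypes and g.etypes.
    """
    flag = getattr(g, "is_homogeneous", None)
    if flag is not None:
        # Accept either a property value or a zero-argument method.
        return bool(flag() if callable(flag) else flag)
    return len(g.ntypes) == 1 and len(g.etypes) == 1
```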

Jun 14 '22 18:06 bioannidis

OK, thanks for this. I understand the need for the dictionary. The question is why it does not work with the dictionary when the graph is homogeneous.

I see. I think the issue is with DistDataLoader / DistGraph. For the normal DataLoader, we support such usage. Let me verify this once I have time.

Jun 15 '22 14:06 jermainewang

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

Jul 21 '22 01:07 github-actions[bot]

This issue has been automatically marked as stale due to lack of activity. It will be closed if no further activity occurs. Thank you

Aug 25 '22 01:08 github-actions[bot]

This issue is closed due to lack of activity. Feel free to reopen it if you still have questions.

Sep 01 '22 01:09 github-actions[bot]
