pytorch_geometric

Creating Subgraphs based on edge/node type with HeteroData

Open tiffaina opened this issue 3 years ago • 3 comments

🚀 The feature, motivation and pitch

Hi, I'm working on link prediction with a heterograph and trying to port code from DGL to PyG. @rusty1s

  1. Given a HeteroData object with several edge types, I want a subgraph that contains all nodes connected by a specific edge type. Does PyG have a feature like this? DGL has dgl.edge_type_subgraph, and I'd like to replicate it in PyG. For example, in DGL, dgl.edge_type_subgraph(graph, [('author', 'writes', 'paper'),]) returns a graph with all the 'writes' edges. PyG currently has a subgraph() function, but it only works on nodes, and you have to explicitly pass the nodes you want to keep. I think HeteroData should offer the same functionality for edges, with the option to simply specify which node/edge types to keep.
  2. Quick access to features: is there a way to access all edges in a heterograph? DGL has g.edges(), which can return a 2-tuple of 1D tensors (𝑈, 𝑉) representing the source and destination nodes of all edges.
  3. In addition to your from_networkx() util, a from_dgl() function would also be helpful :)
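
To make the first request concrete, here is a minimal sketch of the semantics I have in mind, written in plain Python rather than PyG or DGL. The dictionary layout, the helper name edge_type_subgraph, and the toy 'published_in' relation are all assumptions made for illustration; this is not either library's API or implementation.

```python
from typing import Dict, List, Tuple

EdgeType = Tuple[str, str, str]          # (source node type, relation, destination node type)
EdgeIndex = Tuple[List[int], List[int]]  # (source node ids, destination node ids)


def edge_type_subgraph(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeType, EdgeIndex],
    keep: List[EdgeType],
) -> Tuple[Dict[str, int], Dict[EdgeType, EdgeIndex]]:
    """Hypothetical helper: keep the edge types in `keep` and the node types they touch."""
    missing = [et for et in keep if et not in edges]
    if missing:
        raise KeyError(f"unknown edge types: {missing}")
    # Keep only the requested relations, with their edges unchanged.
    sub_edges = {et: edges[et] for et in keep}
    # Keep every node type that appears as a source or destination of a kept relation.
    node_types = {t for (src, _, dst) in keep for t in (src, dst)}
    sub_nodes = {t: n for t, n in num_nodes.items() if t in node_types}
    return sub_nodes, sub_edges


# Toy heterograph: 3 authors, 4 papers, 2 venues.
num_nodes = {"author": 3, "paper": 4, "venue": 2}
edges = {
    ("author", "writes", "paper"): ([0, 1, 2], [0, 1, 3]),
    ("paper", "published_in", "venue"): ([0, 3], [1, 0]),
}
sub_nodes, sub_edges = edge_type_subgraph(
    num_nodes, edges, [("author", "writes", "paper")]
)
```

In this sketch node ids are not relabeled: all nodes of the touched node types are kept, so the edge indices stay valid as they are. Whether isolated nodes should be dropped instead is a design choice the sketch leaves open.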

Alternatives

No response

Additional context

No response

tiffaina • Jul 28 '22 15:07

Thanks for the Issue.

  1. This is a good idea. I think it should be straightforward to add an edge_list argument to subgraph and filter out edges not in edge_list.
  2. We don't have exactly the same function. You could get something similar with HeteroData.edge_stores, but this gives you the other edge attributes too, which you would need to filter out.
  3. This is a nice idea. I'm not familiar enough with DGL to suggest how to go about it, though.

Happy to accept PRs for 1 and 3.

wsad1 • Jul 29 '22 05:07

  1. @wsad1 Do you want to integrate it? :)
  2. We have data.edge_index_dict and data.to_homogeneous().edge_index, which should be exactly what you want.
  3. I think this is a nice idea.
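
For the second point, the two views differ in shape: a per-edge-type dictionary keeps node ids local to each node type, while a single (U, V) pair over all edges needs one shared index space. The plain-Python sketch below shows one way to flatten the former into the latter. The helper name all_edges, the toy data, and the offset scheme (node types laid out back to back in dictionary order) are assumptions for illustration, not PyG's actual code.

```python
from typing import Dict, List, Tuple

EdgeType = Tuple[str, str, str]          # (source node type, relation, destination node type)
EdgeIndex = Tuple[List[int], List[int]]  # (source node ids, destination node ids)


def all_edges(
    num_nodes: Dict[str, int],
    edges: Dict[EdgeType, EdgeIndex],
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: flatten per-type edges into one global (U, V) pair."""
    # Assign each node type a contiguous block of global ids.
    offsets: Dict[str, int] = {}
    total = 0
    for node_type, count in num_nodes.items():
        offsets[node_type] = total
        total += count

    u: List[int] = []
    v: List[int] = []
    for (src_type, _, dst_type), (src, dst) in edges.items():
        # Shift type-local ids into the shared index space.
        u.extend(offsets[src_type] + i for i in src)
        v.extend(offsets[dst_type] + j for j in dst)
    return u, v


# Toy heterograph: authors get global ids 0-2, papers 3-6, venues 7-8.
num_nodes = {"author": 3, "paper": 4, "venue": 2}
edges = {
    ("author", "writes", "paper"): ([0, 1, 2], [0, 1, 3]),
    ("paper", "published_in", "venue"): ([0, 3], [1, 0]),
}
u, v = all_edges(num_nodes, edges)
```

If the per-type view is enough, iterating over the dictionary directly avoids the offset bookkeeping altogether.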

rusty1s • Jul 30 '22 15:07

Great, thank you! Please keep me updated :) @wsad1 @rusty1s

tiffaina • Aug 03 '22 17:08

Thanks. Let me know if from_dgl() ever gets integrated!

tiffaina • Aug 16 '22 14:08

Please consider contributing it as well if I don't find time to do it :)

rusty1s • Aug 16 '22 15:08