
Support source and target node type specification for link prediction

Open devineyfajr opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Sometimes we need to predict links between two different node types, for example between :Identity nodes and :Organization nodes (who belongs to which organization?). Currently we can create a graph that contains all the Identity and Organization nodes and all the (i:Identity)-[r:BELONGS_TO]->(o:Organization) relationships, but it appears that when negative edges are sampled, they include (O,I), (O,O), and (I,I) pairs, when we really only want (I,O) pairs.

Also, in the prediction phase, it appears predictions are made for all four combinations, instead of just (I,O) pairs. This unnecessarily increases runtime.

Describe the solution you would like

Add sourceNodeTypes and targetNodeTypes variables somewhere in the pipeline configuration. They could default to all node types if not specified. Then modify the negative sampling and prediction routines to use them.
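
To illustrate the requested behavior, here is a minimal Python sketch of negative sampling restricted to (source, target) label pairs. It is not GDS code: the function `sample_negative_pairs`, the `nodes_by_label` mapping, and the node ids are all hypothetical, and it simply enumerates candidate pairs, which is only reasonable for small graphs.

```python
import random
from itertools import product


def sample_negative_pairs(nodes_by_label, positive_pairs, source_labels,
                          target_labels, n_samples, seed=0):
    """Sample negative (source, target) pairs restricted by node label.

    Hypothetical helper for illustration only; it is not part of GDS.
    """
    rng = random.Random(seed)
    # Collect candidate endpoints from the requested labels only.
    sources = sorted(n for label in source_labels for n in nodes_by_label[label])
    targets = sorted(n for label in target_labels for n in nodes_by_label[label])
    # Keep only pairs that are not self-pairs and not existing relationships.
    candidates = [
        (s, t)
        for s, t in product(sources, targets)
        if s != t and (s, t) not in positive_pairs
    ]
    if n_samples > len(candidates):
        raise ValueError("not enough negative candidates for the requested sample size")
    # Sampling without replacement, so no pair is returned twice.
    return rng.sample(candidates, n_samples)


# Made-up example data: three Identity nodes and two Organization nodes.
nodes_by_label = {
    "Identity": ["i1", "i2", "i3"],
    "Organization": ["o1", "o2"],
}
positive_pairs = {("i1", "o1"), ("i2", "o2")}

negatives = sample_negative_pairs(
    nodes_by_label, positive_pairs,
    source_labels=["Identity"], target_labels=["Organization"],
    n_samples=3,
)
print(negatives)
```

With these inputs every sampled pair is an (I,O) pair. The same source and target restriction could be applied when enumerating candidate pairs at prediction time, so that (O,I), (O,O), and (I,I) pairs are never scored.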

Describe alternatives you have considered

Additional context

devineyfajr avatar May 10 '22 13:05 devineyfajr

Hi @devineyfajr,

This is a nice feature request, thank you. We have discussed this internally too, and it's in our backlog. Hopefully we can get to it soon! We will keep you posted.

Adam

adamnsch avatar May 23 '22 13:05 adamnsch

Hi @devineyfajr,

Just to give you an update, this feature will be included in the upcoming GDS 2.2 release.

Adam

adamnsch avatar Aug 16 '22 07:08 adamnsch

Cool


devineyfajr avatar Aug 16 '22 14:08 devineyfajr

Awesome @adamnsch. I could really use this feature right now as well. Is there an approximate date for when the GDS 2.2 update will take place?

gurugecl avatar Aug 18 '22 11:08 gurugecl

@gurugecl The current estimate is the end of September.

FlorentinD avatar Aug 18 '22 11:08 FlorentinD

@gurugecl 2.2.0 is now released :)

FlorentinD avatar Oct 10 '22 12:10 FlorentinD