
Parallelized Embedding

Open bruriah1999 opened this issue 3 years ago • 5 comments

Hey, I'm trying to process a directed graph with about 5 million nodes and 100 million edges. I've managed to load the graph from a CSV file, and I get a very nice Graph object (within 5 minutes). I'm now trying to embed the graph with grape.embedders.Node2VecSkipGramEnsmallen, but it doesn't seem to finish: I've let it run for over 10 hours. To make it faster, I enabled the Graph's vector_source, vector_cumulative_node_degree and vector_reciprocal_sqrt_degrees. From your paper, it seems that the embedding process can be parallelized, but I can't find how to do that. Could you describe which parts of the embedding process are parallelized, and how I can make it run in parallel? Thank you, Bruria.

bruriah1999 avatar Dec 15 '22 14:12 bruriah1999

Hi! I am not sure I understand the issue you have encountered. Would you be available to do a short call to investigate this?

LucaCappelletti94 avatar Dec 15 '22 14:12 LucaCappelletti94

If yes, I am available on the GRAPE Telegram group and Discord channel to set it up.

LucaCappelletti94 avatar Dec 15 '22 14:12 LucaCappelletti94

I'm trying the same with 5.24M nodes and 13.89M edges. Where can I find the Telegram group or Discord?

franz101 avatar Jan 09 '23 21:01 franz101

Both are available in the README.

LucaCappelletti94 avatar Jan 09 '23 21:01 LucaCappelletti94

Hi @bruriah1999, were you able to solve your issue?

sanyabt avatar May 01 '24 16:05 sanyabt