tulip-python, deleting graph and freeing memory

Open LoannGio opened this issue 4 years ago • 2 comments

Hi,

I'm using the tulip-python 5.4.0 package to generate thousands of graphs and compute node metrics for each of them. However, the process uses too much memory (about 13 GB of RAM for ~10k graphs with <128 nodes each). Here's the snippet of code I'm looping over:

#Gen graph
g = tlp.importGraph(graphGen, genParams)

#Compute metrics
eccentricity = applyTlpAlgorithm_Double(g,"Eccentricity", property_name="res_eccentricity", params=algoParams)
closeness_centrality = applyTlpAlgorithm_Double(g,"Eccentricity", property_name="res_closenness_centrality", params=algoParams)
k_cores = applyTlpAlgorithm_Double(g,"K-Cores", property_name="res_kcores")

#Save metrics
for n in g.nodes():
    nodes_Features.append([
        eccentricity.getNodeValue(n), 
        closeness_centrality.getNodeValue(n),
        k_cores.getNodeValue(n)
    ])

#Try to free some memory; the process still uses >13 GB of RAM for 10k graphs
del g
gc.collect()

Is there a clean way to destroy a graph object in tulip-python so that its memory is freed, i.e. a Python equivalent of, or a call to, the C++ deletion?

Thanks

LoannGio avatar Nov 25 '20 13:11 LoannGio

Hello Loann,

Quick answer. First, in Python, use g.getNodes() rather than g.nodes(); I am not sure that g.nodes() is as clean in Python as it is in C++.

Second, limit the scope of g to a function to help the garbage collector. It seems that you do not need g anymore after your for loop. In Python, as in Java, you do not have to clean up an object yourself; just keep its scope as small as possible (good practice in C++ too). If g is in the global scope of your script, the garbage collector cannot reclaim it because you could still use it (Python is not compiled, so a future use cannot be ruled out). If g exists only inside a function, it will be cleaned up more easily when the function returns, as long as no reference to g escapes (for example a = g followed by return a; in Python, every variable is a reference).
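The scoping advice can be checked without Tulip at all. The sketch below uses a hypothetical stand-in class, FakeGraph (not part of tulip-python), and weak references to observe whether objects created inside a function are still alive after it returns. It only says something about Python-level objects in CPython; it does not show how memory held by a C++ library behaves.

```python
import gc
import weakref


class FakeGraph:
    """Hypothetical stand-in for a graph object; not part of tulip-python."""

    def __init__(self, n_nodes):
        self.values = [float(i) for i in range(n_nodes)]


def compute_features(n_nodes, probes):
    # g is local: once the function returns, nothing refers to it anymore.
    g = FakeGraph(n_nodes)
    probes.append(weakref.ref(g))  # a weak reference does not keep g alive
    return list(g.values)          # return plain copied values, not g itself


def compute_features_leaky(n_nodes, probes, keep):
    # Same work, but a strong reference to g escapes through `keep`.
    g = FakeGraph(n_nodes)
    probes.append(weakref.ref(g))
    keep.append(g)
    return list(g.values)


scoped_probes, leaked_probes, keep = [], [], []
for _ in range(3):
    compute_features(10, scoped_probes)
    compute_features_leaky(10, leaked_probes, keep)
gc.collect()

# Count how many graph objects are still reachable after the loop.
scoped_alive = sum(ref() is not None for ref in scoped_probes)
leaked_alive = sum(ref() is not None for ref in leaked_probes)
print(scoped_alive, leaked_alive)
```

The function-scoped objects are gone after the loop, while the ones stored in keep survive. Any list, dict, or property object that still holds the graph would have the same effect as keep here.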

Hope this helps.

bpinaud avatar Nov 25 '20 13:11 bpinaud

I've tried encapsulating g in a function, ending up with my loop:


for _ in range(n_graphs):
    n_nodes = random.randint(min_nodes, max_nodes)
    
    nodes_Features.append(gen_features(graphGen, n_nodes)) # encapsulating method

g is created in gen_features, is never referenced elsewhere, is never assigned to another variable, and is released before the method ends (g = None). Still, the garbage collector does not seem to free the memory.
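One way to narrow this down, offered only as a sketch, is to measure whether the Python heap itself grows across iterations. The standard tracemalloc module only tracks memory allocated through Python's allocators, so memory allocated inside a C++ extension does not show up in it. If the tracked size stays flat while the process's resident memory keeps rising, that would suggest the growth is on the native side rather than in lingering Python references. The helper python_heap_growth and both workloads below are hypothetical; in practice the workload would be a call such as gen_features.

```python
import gc
import tracemalloc


def python_heap_growth(work, iterations):
    """Return the change in Python-tracked bytes after calling work() repeatedly."""
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        work()
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


def scoped_work():
    # Temporary data is dropped when the function returns.
    data = [float(i) for i in range(1000)]
    return sum(data)


kept = []


def leaking_work():
    # Data stays reachable through the module-level list `kept`.
    kept.append([float(i) for i in range(1000)])


scoped_growth = python_heap_growth(scoped_work, 200)
leaking_growth = python_heap_growth(leaking_work, 200)
print(scoped_growth, leaking_growth)
```

The scoped workload should show little or no growth, while the leaking one grows with the number of iterations. Exact byte counts depend on the Python version and platform.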

LoannGio avatar Nov 25 '20 14:11 LoannGio