MemoryError while loading graph

Open GiulioRossetti opened this issue 4 years ago • 14 comments

If I try to load a relatively small network (~86980 edges) composed of three snapshots, I get the following error:

MemoryError: Unable to allocate array with shape (110011, 110011, 3) and data type float64
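For scale, the size of that request can be worked out directly from the shape in the message. The snippet below is only a back-of-the-envelope sketch: `dense_nbytes` is a hypothetical helper written for this illustration, not part of teneto, and the only assumption is that a float64 value takes 8 bytes.

```python
# Back-of-the-envelope size of the dense array named in the MemoryError.
FLOAT64_BYTES = 8  # one float64 value occupies 8 bytes


def dense_nbytes(shape, itemsize=FLOAT64_BYTES):
    """Return the number of bytes needed for a dense array of this shape."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * itemsize


shape = (110011, 110011, 3)  # shape reported in the error message
nbytes = dense_nbytes(shape)
print(f"{nbytes / 1e9:.1f} GB")  # roughly 290 GB
```

The cost grows with the square of the node dimension, so the number of edges has no influence on it.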

I get the same behavior whether I load an edgelist or a pandas DataFrame.

Is it a known scalability issue or is there something that I am missing? If so, what's the network size that teneto is able to handle?

GiulioRossetti avatar Jun 05 '20 14:06 GiulioRossetti

Unfortunately, we have some speed/size issues that I just haven't had time to address, as this is more of a hobby project at the moment. Most of the functions are designed to work with networks of size 1000x1000x1000.

I am assuming you are trying to use the TemporalNetwork class?

If yes, you can try the hdf5 flag. You may hit a MemoryError a little later, depending on what you try to do, as not everything is hdf5 optimized. Or you may hit some speed issues. But, hopefully, that will work.

wiheto avatar Jun 05 '20 15:06 wiheto

Thanks for your quick reply! Yes, I'm using the TemporalNetwork class. I'll try the hdf5 flag, thanks for the suggestion!

GiulioRossetti avatar Jun 05 '20 15:06 GiulioRossetti

No problem. Let me know how it goes.

If not, I will make sure we get that network size covered when I get around to rewriting the core of teneto (but finding time is the problem).

wiheto avatar Jun 05 '20 15:06 wiheto

Unfortunately, setting the flag has no effect.

GiulioRossetti avatar Jun 05 '20 16:06 GiulioRossetti

Could you explain a little more (just for my understanding)? Did it fail while loading the network, or while running a function after it was loaded? If the latter, which function?

wiheto avatar Jun 05 '20 19:06 wiheto

Still at loading stage.

GiulioRossetti avatar Jun 05 '20 19:06 GiulioRossetti

And the network is dense (i.e. few edges are 0)?

wiheto avatar Jun 05 '20 19:06 wiheto

The network is not particularly dense: ~87k directed edges for 8k nodes, so the approximate density of the static graph is 0.001.
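A side note on the numbers: the array in the error has 110011 entries along each node axis, while the network has only about 8k nodes. One possible reason, which is an assumption and not something confirmed in this thread, is that the node IDs are not contiguous integers, so the node dimension follows the largest ID and not the number of distinct nodes. If that is the case, relabeling the IDs to 0..n-1 before loading would shrink the dense shape. A minimal sketch, where `relabel_nodes` is a hypothetical helper and the edge values are made up for illustration:

```python
def relabel_nodes(edges):
    """Map arbitrary node IDs in (i, j, t) edges to contiguous IDs 0..n-1."""
    # Sorted so that the new labels preserve the order of the original IDs.
    node_ids = sorted({node for i, j, _ in edges for node in (i, j)})
    mapping = {old: new for new, old in enumerate(node_ids)}
    relabeled = [(mapping[i], mapping[j], t) for i, j, t in edges]
    return relabeled, mapping


# Toy edgelist with sparse, non-contiguous node IDs (illustrative values only).
edges = [(5, 110010, 0), (110010, 42, 1), (42, 5, 2)]
relabeled, mapping = relabel_nodes(edges)
print(relabeled)  # [(0, 2, 0), (2, 1, 1), (1, 0, 2)]
```

Keeping `mapping` around makes it possible to translate results back to the original node IDs afterwards.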

GiulioRossetti avatar Jun 05 '20 19:06 GiulioRossetti

Thanks. Could you send the complete error message (i.e. the ca. 10-20 lines above the MemoryError)? It will help me isolate the memory-hogging process.

Sorry about this.

wiheto avatar Jun 05 '20 20:06 wiheto

Don't worry, I'll send you the whole stacktrace first thing tomorrow morning!

GiulioRossetti avatar Jun 05 '20 20:06 GiulioRossetti

Here it is:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-76-bd3519b38063> in <module>
----> 1 tnet = TemporalNetwork(from_edgelist=edges, hdf5=True, hdf5path="data/", diagonal=False)
      2 tnet.network

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in __init__(self, N, T, nettype, from_df, from_array, from_dict, from_edgelist, timetype, diagonal, timeunit, desc, starttime, nodelabels, timelabels, hdf5, hdf5path, forcesparse)
    139             self.network_from_df(from_df)
    140         if from_edgelist is not None:
--> 141             self.network_from_edgelist(from_edgelist)
    142         elif from_array is not None:
    143             self.network_from_array(from_array, forcesparse=forcesparse)

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in network_from_edgelist(self, edgelist)
    264             colnames = ['i', 'j', 't']
    265         self.network = pd.DataFrame(edgelist, columns=colnames)
--> 266         self._update_network()
    267 
    268     def network_from_dict(self, contact):

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in _update_network(self)
    226         """Helper function that updates the network info"""
    227         self._calc_netshape()
--> 228         self._set_nettype()
    229         if self.nettype:
    230             if self.nettype[1] == 'u':

~/anaconda3/lib/python3.7/site-packages/teneto/classes/network.py in _set_nettype(self)
    179             self.nettype = 'xu'
    180             G1 = teneto.utils.df_to_array(
--> 181                 self.network, self.netshape, self.nettype)
    182             self.nettype = 'xd'
    183             G2 = teneto.utils.df_to_array(

~/anaconda3/lib/python3.7/site-packages/teneto/utils/utils.py in df_to_array(df, netshape, nettype)
    764     if len(df) > 0:
    765         idx = np.array(list(map(list, df.values)))
--> 766         tnet = np.zeros([netshape[0], netshape[0], netshape[1]])
    767         if idx.shape[1] == 3:
    768             if nettype[-1] == 'u':

MemoryError: Unable to allocate array with shape (110011, 110011, 2) and data type float64
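The last frame of the traceback is a dense `np.zeros` allocation sized by the node and time dimensions. As a general illustration of the alternative (this is not teneto's implementation, and `edges_by_snapshot` is a hypothetical helper), storing only the edges that exist keeps memory proportional to the number of edges:

```python
from collections import defaultdict


def edges_by_snapshot(edges):
    """Group (i, j, t) edges into {t: {(i, j): weight}}, using weight 1.0."""
    snapshots = defaultdict(dict)
    for i, j, t in edges:
        snapshots[t][(i, j)] = 1.0  # binary edges; absent pairs are implicit zeros
    return dict(snapshots)


# Toy edgelist in the same (i, j, t) column order as the traceback's colnames.
edges = [(0, 1, 0), (1, 2, 0), (0, 2, 1)]
snapshots = edges_by_snapshot(edges)
n_stored = sum(len(snapshot) for snapshot in snapshots.values())
print(n_stored)  # 3 stored entries, one per edge
```

With this layout, ~87k edges mean ~87k stored entries, whatever the node dimension is.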

GiulioRossetti avatar Jun 06 '20 08:06 GiulioRossetti

Thanks.

Try adding:

forcesparse=True

wiheto avatar Jun 06 '20 08:06 wiheto

Tried: same stacktrace.

GiulioRossetti avatar Jun 06 '20 08:06 GiulioRossetti

Thanks. I will look into this.

wiheto avatar Jun 06 '20 08:06 wiheto