
[BUG] graspologic takes 33 seconds to import

Open loftusa opened this issue 1 year ago • 9 comments

Problem

Graspologic is taking an extremely long time to import for me. This is after a fresh pip install --upgrade graspologic. (Also had to pip install --upgrade numba and pip install --upgrade numpy to get it to import)

I timed it and it looks like it takes around 33 seconds, and importing it also gives some strange umap numba warning.

[Image: Screenshot 2023-05-20 at 9 19 02 PM]

Example Code


from time import time
start = time()
import graspologic
end = time()

print(end - start)

Full Traceback

/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/umap_learn-0.5.3-py3.9.egg/umap/umap_.py:660: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @numba.jit()
/usr/local/lib/python3.9/site-packages/graspologic/models/edge_swaps.py:215: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  _edge_swap_numba = nb.jit(_edge_swap)

Your Environment

  • Python version: 3.9.13
  • graspologic version: 3.0.0

Additional Details

This is in the graphstatsbook docker container with 7 cpus allocated and about 2/3 of my RAM on a 2022 macbook air with m2 chip.

loftusa avatar May 21 '23 04:05 loftusa

I don't plan on working on this, but if anyone wants to speed things up, go for it.

I'd also just note that importing a specific function or class is usually pretty quick

bdpedigo avatar May 22 '23 14:05 bdpedigo

I did some light profiling on this with python -X importtime -c 'import graspologic'. Here's what came up: import_times.txt
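
For anyone who wants to reproduce this kind of profile without extra tooling, here is a minimal sketch. It runs python -X importtime in a subprocess and ranks modules by cumulative import time. The helper name `slowest_imports` is made up for illustration, and the stdlib module `json` stands in for graspologic so the snippet runs anywhere.

```python
import subprocess
import sys


def slowest_imports(module, top=10):
    """Return (cumulative_us, self_us, name) tuples for the slowest imports.

    `slowest_imports` is a hypothetical helper, not part of graspologic.
    """
    # -X importtime writes one line per imported module to stderr.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us = int(parts[0])
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # skip the column-header line
        rows.append((cumulative_us, self_us, parts[2].strip()))
    # Largest cumulative time first.
    rows.sort(reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # Swap "json" for "graspologic" in an environment where it is installed.
    for cumulative_us, self_us, name in slowest_imports("json"):
        print(f"{cumulative_us / 1e6:8.3f}s  {name}")
```

Cumulative time includes everything a module imports in turn, so a top-level package will normally sit at the head of the list, with its slowest dependencies right behind it.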

loftusa avatar May 23 '23 15:05 loftusa

@bdpedigo I looked at this a bit more just now using tuna. here's the import profile for graspologic:

[Image: Screenshot 2023-12-07 at 9 18 44 AM]

appears to be mainly the umap import in graspologic.layouts.auto and ot in graspologic.align.seedless_procrustes

loftusa avatar Dec 07 '23 17:12 loftusa

that's interesting! and a cool tool/visualization

im open to discussing proposed fixes, i just dont really know what could be done here, since those other libraries are out of our control

i can tell you that i dont think we use anything under ot.backend.tensorflow or ot.backend.torch... so if there's some way to turn off those imports perhaps that could be a big save?

bdpedigo avatar Dec 07 '23 20:12 bdpedigo

i wonder why the load time is so much shorter for tuna than you, though?

bdpedigo avatar Dec 07 '23 20:12 bdpedigo

> i wonder why the load time is so much shorter for tuna than you, though?

no clue, I noticed that too, how long does it take for you?

> that's interesting! and a cool tool/visualization
>
> im open to discussing proposed fixes, i just dont really know what could be done here, since those other libraries are out of our control
>
> i can tell you that i dont think we use anything under ot.backend.tensorflow or ot.backend.torch... so if there's some way to turn off those imports perhaps that could be a big save?

throw imports inside of functions maybe? makes those functions take "longer" to run, but shorter for anybody who just wants to import the package
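
A minimal sketch of that idea, with loud assumptions: `layout_example` is a made-up function, and the stdlib `statistics` module stands in for a heavy dependency such as umap. Nothing here is graspologic's actual code.

```python
import sys


def layout_example(values):
    """Hypothetical function that needs a heavy dependency only when called."""
    # Deferred import: the dependency is loaded on the first call, not when
    # the enclosing package is imported. `statistics` is only a stand-in for
    # something slow like umap.
    import statistics

    return statistics.mean(values)


if __name__ == "__main__":
    print(layout_example([1, 2, 3]))
    # After the first call the module is registered in sys.modules.
    print("statistics" in sys.modules)
```

The trade-off is that an import error for a missing dependency shows up at call time rather than at package import time.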

loftusa avatar Dec 11 '23 20:12 loftusa

https://github.com/PythonOT/POT/issues/516 i wonder to what extent your issue is related to this? what version of POT are you on? it sounds like the root cause is tensorflow, do you have tensorflow installed in this environment?

bdpedigo avatar Dec 11 '23 21:12 bdpedigo

i guess another question - is there a reason you are needing to import all of graspologic, if you're saying you dont want some of these functions? might be much faster to just import the function(s) you need

bdpedigo avatar Dec 11 '23 21:12 bdpedigo

> > i wonder why the load time is so much shorter for tuna than you, though?
>
> no clue, I noticed that too, how long does it take for you?
>
> > that's interesting! and a cool tool/visualization
> >
> > im open to discussing proposed fixes, i just dont really know what could be done here, since those other libraries are out of our control
> >
> > i can tell you that i dont think we use anything under ot.backend.tensorflow or ot.backend.torch... so if there's some way to turn off those imports perhaps that could be a big save?
>
> throw imports inside of functions maybe? makes those functions take "longer" to run, but shorter for anybody who just wants to import the package

does this import stick around? are you paying the cost only the first time? if so, this seems totally reasonable to me, but if you add 33 seconds every time you try to save your graph layout, it's going to be a bit wonky. doesn't mean there won't be other ways to fix it, just that this specific one may not work.
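
On the caching question: Python keeps every imported module in `sys.modules`, so an import inside a function pays the load cost only on the first call within a process, and later calls reduce to a lookup. The cost does come back in every new interpreter process. A small sketch to check this, where `timed_import` is a hypothetical helper and the stdlib `decimal` module stands in for a slow dependency:

```python
import importlib
import sys
from time import perf_counter


def timed_import(name):
    """Import `name` and return (module, seconds). Hypothetical helper."""
    start = perf_counter()
    # import_module consults sys.modules first, like an `import` statement.
    module = importlib.import_module(name)
    return module, perf_counter() - start


if __name__ == "__main__":
    first_module, first = timed_import("decimal")
    second_module, second = timed_import("decimal")
    # The second call finds the cached module and does not re-execute it.
    print(f"first: {first:.6f}s, second: {second:.6f}s")
    print(first_module is second_module)
```

If `decimal` was already loaded before the first call, both timings will be small; the same measurement with umap or ot in a fresh interpreter would show the one-time cost more clearly.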

daxpryce avatar Jan 23 '24 21:01 daxpryce