Import time
Hello,
pynndescent takes ages to load, most likely because of heavy usage of numba jit. Would it be possible to use cache or AOT compilation?
In[1]: %time import pynndescent
Wall time: 17.9 s
In[2]: pynndescent.__version__
Out[2]: '0.5.1'
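For anyone who wants to reproduce the measurement outside IPython, here is a minimal sketch that uses the interpreter's `-X importtime` flag (Python 3.7+) in a fresh subprocess, so modules already loaded in the current session do not skew the number. The helper name `import_seconds` is hypothetical and not part of pynndescent.

```python
import subprocess
import sys


def import_seconds(module: str) -> float:
    """Return the cumulative import time of `module`, in seconds,
    measured in a fresh interpreter process."""
    # -X importtime writes one line per imported module to stderr:
    #   import time: <self us> | <cumulative us> | <module name>
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        _self_us, cumulative_us, name = line[len("import time:"):].split("|")
        # Nested imports are indented; the requested module matches exactly
        # once its surrounding whitespace is stripped.
        if name.strip() == module:
            return int(cumulative_us) / 1e6
    raise LookupError(f"{module} did not appear in the importtime output")


if __name__ == "__main__":
    # Replace "colorsys" with "pynndescent" to reproduce the report above.
    print(f"{import_seconds('colorsys'):.3f} s")
```

Running `python -X importtime -c "import pynndescent"` directly also prints the per-module breakdown, which helps show which submodules dominate.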
Ahead of time compilation creates some packaging issues. Perhaps it could be done, but it is beyond what I can manage right now (I would welcome any help).
I have tried caching, but it breaks for several functions, and produced issues for others, so it is avoided for a lot of cases.
I certainly understand the concerns. It is a long import time. Unfortunately I am not sure I can do much to remedy it right now.
I just want to second this issue. It would be very helpful if something could be done.
This makes this excellent package (as well as others reliant on it - like umap) unusable in many production applications.
I now have two independent examples where using pynndescent would solve my problems, but I can't deploy it because the import is too slow...
This is not a complaint, I'm thankful for the package, it's just a pity I can't use it in practice...
It looks like applying cache=True very aggressively to all the top-level numba-jitted functions in the various submodules does a good job of alleviating a lot of this. Obviously I still need to check that caching is not going to cause other issues, but perhaps a solution might be forthcoming soon.
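For readers unfamiliar with the option, a minimal sketch of what cache=True looks like on a jitted function. The function `sum_of_squares` is a made-up example, not taken from pynndescent, and the `ImportError` fallback exists only so the snippet still runs where numba is not installed. Note that numba's on-disk cache needs the function to live in a real source file, so this is meant to be saved and run as a script.

```python
try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without numba installed.
    # Nothing is compiled or cached in that case.
    def njit(*args, **kwargs):
        return lambda func: func


# cache=True asks numba to store the compiled machine code on disk
# (by default in a __pycache__ directory next to the source file).
# Later processes load that result instead of recompiling, which is
# where the start-up saving comes from.
@njit(cache=True)
def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # Without an explicit signature, compilation happens on this first call.
    # With an explicit signature it would happen at decoration time,
    # that is, during import.
    print(sum_of_squares(4))
```

The first run pays the full compilation cost and writes the cache; the benefit only shows up from the second process onward.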
It’s still very slow compared to a lot of other packages. I can’t say I’d expect this to make umap “unusable in many production applications”, as that just means you should use a long running process instead of spawning a new python process for every call.
But it certainly hurts when you want to import a dependent package for something unrelated, e.g. for test collection.
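One way a dependent package can avoid paying this cost at test collection is to defer the heavy import until it is first used. Below is a sketch based on the standard library's `importlib.util.LazyLoader`; the helper name `lazy_import` is hypothetical, and this is not necessarily how the fix in scanpy was done.

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Return a module whose body only executes on first attribute access."""
    if name in sys.modules:
        # Already imported (eagerly or lazily); reuse it.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the real module
    # code runs when an attribute is first looked up.
    loader.exec_module(module)
    return module


if __name__ == "__main__":
    # "colorsys" stands in for a slow dependency such as pynndescent.
    colorsys = lazy_import("colorsys")          # cheap, nothing executed yet
    print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))   # the import happens here
```

A plain function-level `import pynndescent` inside the code path that needs it achieves the same effect with less machinery, at the cost of scattering imports through the code.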
This is how importing scanpy looked before I noticed and fixed it: