Fully lazy imports of IO submodules

Open CodyCBakerPhD opened this issue 1 year ago • 2 comments

@JuliaSprenger @samuelgarcia

Hello everyone - a recent issue on NeuroConv drew our attention to the fact that global module import structures such as https://github.com/NeuralEnsemble/python-neo/blob/master/neo/init.py#L10-L11 and https://github.com/NeuralEnsemble/python-neo/blob/master/neo/io/init.py#L278-L379 will actively import every single format as long as the dependencies are installed.

Thankfully, both of our projects have always used lazy importing with respect to conditional installation - a module will not be loaded if it is not installed, but will throw an error on attempting to use the IO.
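For readers less familiar with the pattern, here is a minimal sketch of a conditional-installation guard. It is not the actual neo or NeuroConv code: `ExampleIO` and the package name are hypothetical, and the package name is deliberately one that will not exist so the fallback branch runs.

```python
# Guarded import of an optional dependency. The package name is hypothetical
# and chosen so that the ImportError branch is exercised here.
try:
    import some_optional_backend_that_is_not_installed as backend
    HAVE_BACKEND = True
except ImportError:
    backend = None
    HAVE_BACKEND = False


class ExampleIO:
    """Hypothetical IO class. The class itself is always importable."""

    def __init__(self, filename):
        # The error is deferred until someone actually tries to use the IO.
        if not HAVE_BACKEND:
            raise ImportError("ExampleIO requires the optional backend package")
        self.filename = filename
```

Note that when the dependency is installed, the guarded import still runs eagerly at module load, which is the cost discussed in this issue.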

However it occurred to us that as the number of formats grows, and as any tertiary or deeper dependencies in those other packages grow, the total startup time of the package grows as well as the risk that any one of those deeper dependencies may eventually have some platform, Python version, or other configuration problem (using the above issue as a prime example). It would be nice if formats or other submodule imports only occurred on-demand as it is rare for a single use-case to need immediate access to every single IO (rather, it's more typical that it might only need to access one or two).

We are apparently not alone in thinking this - the in-progress PEP 690 involves a rigorous discussion about some common importlib tools for this functionality, though the existing 5-year-old PEP 562 already defines a process for doing this. As such, we were looking into implementing __getattr__, __dir__, and __all__ overrides much like scikit-image recently implemented.
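A rough sketch of the PEP 562 approach, to make the idea concrete. This is not the scikit-image implementation. In a real package the two hooks would simply be functions named `__getattr__` and `__dir__` in the package's `__init__.py`. To keep the example runnable as a single file, a synthetic module called `lazy_pkg` (hypothetical) is built by hand, and standard-library modules stand in for heavy IO submodules.

```python
import importlib
import types

# Hypothetical mapping: public attribute name -> module to import on demand.
_LAZY_MODULES = {"colorsys": "colorsys", "wave": "wave"}

lazy_pkg = types.ModuleType("lazy_pkg")


def _lazy_getattr(name):
    # PEP 562: called only when normal attribute lookup on the module fails.
    if name in _LAZY_MODULES:
        module = importlib.import_module(_LAZY_MODULES[name])
        # Cache on the module so later accesses bypass this hook.
        setattr(lazy_pkg, name, module)
        return module
    raise AttributeError(f"module 'lazy_pkg' has no attribute {name!r}")


def _lazy_dir():
    # Advertise the lazy names to dir() even before they are imported.
    return sorted(_LAZY_MODULES)


# Equivalent to defining __getattr__, __dir__, and __all__ in __init__.py.
lazy_pkg.__getattr__ = _lazy_getattr
lazy_pkg.__dir__ = _lazy_dir
lazy_pkg.__all__ = sorted(_LAZY_MODULES)
```

With this in place, `lazy_pkg.wave` triggers the import of `wave` on first access only, and nothing is imported for names that are never touched.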

Thing is though, if we implemented this at the NeuroConv level we would still not gain the performance benefit since any time we want to use SpikeInterface with a neo backend, the import structure would become non-lazy unless we also propagated those types of changes down to those levels (which I'd be perfectly fine helping with or doing myself).

What are your thoughts on this?

Also @alejoe91, since (I think?) it would have to occur at the SpikeInterface level as well.

CC: @bendichter

CodyCBakerPhD, Aug 08 '22 20:08

Hi Cody. I totally agree. This is something I wanted to do in both neo and spikeinterface: remove imports at module level and move to on-demand imports inside functions or methods. I will try to do something soon.

samuelgarcia, Aug 22 '22 11:08

@samuelgarcia Cool, thanks.

Just to fill you in on the rest of the conversation - we ultimately decided against using the SPEC001 lazy_import library or any other workaround via __getattr__/__dir__ modifications at the module level, since most IDEs have not adapted their auto-complete methods to support that standard, so any imports made lazy with that approach would not work properly with 'tab-complete'.

So what we decided instead was basically just to move the imports from the top level to inside the functions where they are used. There is occasional code duplication with that approach, but no loss in overall performance, since the first time any package is imported (regardless of global vs. local namespace) it gets added to sys.modules for faster subsequent access.
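A small sketch of that pattern, assuming a hypothetical helper `mean_rate` and using the standard-library `statistics` module as a stand-in for a heavy optional dependency:

```python
import sys


def mean_rate(spike_counts):
    # Import moved from module top level into the function that needs it.
    # The first call pays the import cost; later calls find the module
    # already cached in sys.modules, so the statement is a cheap lookup.
    import statistics  # stand-in for a heavy optional dependency

    return statistics.mean(spike_counts)


first = mean_rate([2, 4, 6])
cached = "statistics" in sys.modules  # populated by the first call
second = mean_rate([1, 3])
```

Importing the file that defines `mean_rate` does not itself import `statistics`; that only happens once the function is called.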

An explicit example: Before and after

CodyCBakerPhD, Aug 22 '22 13:08