
`import sgkit` takes ~1.5s

Open benjeffery opened this issue 3 years ago • 7 comments

For tools using the CLI, this amount of delay feels excessive. Around 1s of this time is spent performing imports. Here's the import flame graph (0.1s on xarray.tutorial?!?):

[Image: sgkit-import-profile]

I assume some of these could be imported only when they are needed, although given that many of the imports are referenced in type annotations, that might not be possible.

I'm not sure yet what the remaining 0.5s is. A cProfile call graph is pretty useless on import, but I think there is a way around that by profiling the individual sgkit files.
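One way to get per-module numbers without a flame graph is CPython's built-in `-X importtime` flag (Python 3.7+). The sketch below is only an illustration: the helper names `import_times` and `slowest_imports` are made up here, and it assumes the module being measured is importable by the current interpreter.

```python
import subprocess
import sys


def import_times(module: str) -> list[tuple[str, int, int]]:
    """Return (imported name, self µs, cumulative µs) rows for `import module`."""
    # -X importtime makes the interpreter write one timing line per import to stderr.
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in proc.stderr.splitlines():
        # Lines look like: "import time:       123 |        456 |   package.module"
        if not line.startswith("import time:"):
            continue
        self_us, cumulative_us, name = line[len("import time:"):].split("|")
        if not self_us.strip().isdigit():
            continue  # skip the header row
        rows.append((name.strip(), int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(module: str, top: int = 10) -> list[tuple[str, int, int]]:
    """Rows with the largest self time, i.e. time not attributed to nested imports."""
    return sorted(import_times(module), key=lambda row: row[1], reverse=True)[:top]
```

Calling `slowest_imports("sgkit")` in an environment where sgkit is installed would list the modules with the largest self time. The run happens in a fresh subprocess, so modules already imported in the calling process do not hide the cost.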

benjeffery avatar Oct 14 '22 09:10 benjeffery

I agree that deferring some of the imports could help.

I think numba compilation is still a significant part of this (see #363), although we now have numba caching on, so it's faster the second time. Not sure if we could defer this until it's needed too.

tomwhite avatar Oct 14 '22 09:10 tomwhite

Thanks @tomwhite, I'll check the numba compilation. I also wonder if PEP484 forward references might help with delaying imports: https://legacy.python.org/dev/peps/pep-0484/#forward-references
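A minimal sketch of what that could look like, assuming the import is needed only for annotations. `fractions.Fraction` is a stand-in for a heavy dependency and `halve` is a hypothetical function, neither is taken from sgkit:

```python
from __future__ import annotations  # annotations are stored as strings, not evaluated

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by static type checkers; this branch never runs at runtime.
    from fractions import Fraction  # stand-in for a heavy dependency


def halve(value: Fraction) -> Fraction:
    """Annotated with Fraction without importing it when this module is imported."""
    return value / 2
```

The type checker still sees the real type, but importing this module does not import `fractions`. One caveat: anything that resolves annotations at runtime (for example `typing.get_type_hints`) would need the name to be importable at that point.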

benjeffery avatar Oct 14 '22 12:10 benjeffery

> I also wonder if PEP484 forward references might help with delaying imports: https://legacy.python.org/dev/peps/pep-0484/#forward-references

It would be great if we could take advantage of forward references.

Do you think xarray itself is doing unnecessary imports - like the tutorial?

tomwhite avatar Oct 17 '22 11:10 tomwhite

> Do you think xarray itself is doing unnecessary imports - like the tutorial?

Yes, planning on checking and raising a PR/issue upstream, unless it looks like a rabbit-hole!

benjeffery avatar Oct 17 '22 11:10 benjeffery

Also see https://github.com/pydata/xarray/issues/6726. It seems they are aware of the issue.

benjeffery avatar Oct 17 '22 11:10 benjeffery

If we're importing some things just for typing purposes, then I'd be +1 for making the typing less strict.

jeromekelleher avatar Oct 17 '22 12:10 jeromekelleher

Pandas is not used very much in the codebase, so it might be possible to import it lazily.

Similarly, the distance API is pretty niche so making that lazier would be good too.
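For a rarely used dependency, the simplest lazy pattern is probably to move the import into the function that needs it. This is only a sketch: `to_table` is a hypothetical helper, and the stdlib `csv` module stands in for pandas so the example runs anywhere.

```python
def to_table(rows: list[dict]) -> str:
    """Hypothetical helper that needs a heavy library only when it is called."""
    # Deferred import: the cost is paid on the first call, not at package import.
    # Later calls find the module in sys.modules, so the repeat cost is small.
    import csv  # stand-in for pandas
    import io

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The trade-off is that an import error surfaces at call time rather than at package import time, which may or may not be acceptable for an optional code path.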

tomwhite avatar Oct 24 '22 15:10 tomwhite