xformers
importing `xformers.ops` implicitly initializes CUDA context
Currently, importing `xformers.ops` implicitly initializes a CUDA context. This has the unpleasant side effect that we cannot use the "fork" multiprocessing start method.
The line of code that initializes the CUDA context is this one:
https://github.com/facebookresearch/xformers/blob/f6637120b58c4b3626b18234f8c0c74c561b8d01/xformers/__init__.py#L52
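For context, here is a minimal sketch (not taken from the issue) of why an import-time CUDA initialization is incompatible with the "fork" start method. It assumes a Linux machine with a working CUDA device; `torch.cuda.init()` stands in for whatever the import does:

```python
# Illustrative sketch only: once the parent process owns a CUDA context,
# forked children cannot use CUDA again.
import multiprocessing as mp

import torch


def child_job(_):
    # Any CUDA call in a forked child now fails with
    # "Cannot re-initialize CUDA in forked subprocess".
    return torch.zeros(1, device="cuda").item()


if __name__ == "__main__":
    torch.cuda.init()  # same effect as the import-time initialization above

    with mp.get_context("fork").Pool(1) as pool:
        try:
            pool.map(child_job, [0])
        except RuntimeError as err:
            print("fork + CUDA failed:", err)
```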
Hi, thanks for reporting this issue. Unfortunately it might take more effort than changing just this line, as we check for device capabilities in multiple places as well... @fmassa @bottler any ideas?
Fixing this would be good for cutting import times.
We need `_is_triton_available` to be called only when a public function is called, not at import time of the public modules. I think we could do that.
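To make the proposed deferral concrete, here is a hedged sketch (not the actual xformers code) of caching the check inside the function so it runs on first use rather than at import; `some_public_op` and the capability threshold are hypothetical stand-ins:

```python
import functools


@functools.lru_cache(maxsize=None)
def _is_triton_available() -> bool:
    # The device-capability query below is what creates the CUDA context, so
    # wrapping it in a cached function defers that cost to the first real call
    # instead of `import xformers.ops`.
    try:
        import triton  # noqa: F401
        import torch

        major, _ = torch.cuda.get_device_capability()
        return major >= 7  # illustrative threshold, not the real xformers check
    except Exception:
        return False


def some_public_op(x):
    # Hypothetical public entry point: the check happens here, lazily,
    # never during import.
    if _is_triton_available():
        return x  # would dispatch to the Triton path
    return x  # would dispatch to a non-Triton fallback
```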
It's possible that commit 737c2e6, which just landed, fixes this.
Hello, confirming this issue is still occurring; we're seeing it locally in xlformers as well.
Possibly this will be okay now after https://github.com/facebookresearch/xformers/commit/be13e229b52d9d0bdf4422be931c67c492b8092f if you set `XFORMERS_ENABLE_TRITON=1`?
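A short usage sketch of that workaround, assuming a build that includes be13e22: set the flag before xformers is imported, and "fork" workers become usable again because the parent process never touches CUDA at import time.

```python
import os

# Per the suggestion above: set before importing xformers so the import
# no longer needs to query the GPU.
os.environ["XFORMERS_ENABLE_TRITON"] = "1"

import multiprocessing as mp

import xformers.ops  # noqa: F401  # should no longer create a CUDA context


def worker(i):
    return i * i


if __name__ == "__main__":
    # "fork" works because no CUDA context exists in the parent process.
    with mp.get_context("fork").Pool(2) as pool:
        print(pool.map(worker, range(4)))
```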
Setting this works for me with xformers v0.0.27. Thanks!