xformers
importing `xformers.ops` implicitly initializes CUDA context
Currently, importing `xformers.ops` implicitly initializes a CUDA context. This has the unpleasant side effect that we cannot use the "fork" multiprocessing start method.
The line of code that initializes the CUDA context is this one:
https://github.com/facebookresearch/xformers/blob/f6637120b58c4b3626b18234f8c0c74c561b8d01/xformers/__init__.py#L52
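For context, here is a minimal sketch (not taken from the issue) of why an import-time CUDA initialization is incompatible with the "fork" start method. It assumes a Linux machine with a working CUDA device; `torch.cuda.init()` stands in for whatever the import does:

```python
# Illustrative sketch only: once the parent process owns a CUDA context,
# forked children cannot use CUDA again.
import multiprocessing as mp

import torch


def child_job(_):
    # Any CUDA call in a forked child now fails with
    # "Cannot re-initialize CUDA in forked subprocess".
    return torch.zeros(1, device="cuda").item()


if __name__ == "__main__":
    torch.cuda.init()  # same effect as the import-time initialization above

    with mp.get_context("fork").Pool(1) as pool:
        try:
            pool.map(child_job, [0])
        except RuntimeError as err:
            print("fork + CUDA failed:", err)
```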
Hi, thanks for reporting this issue. Unfortunately it might take more effort than changing just this line, as we check for device capabilities in multiple places as well... @fmassa @bottler any ideas?
Fixing this would be good for cutting import times.
We need `_is_triton_available` to be called only when a public function is called, not at import time of the public modules. I think we could do that.
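To make the proposed deferral concrete, here is a hedged sketch (not the actual xformers code) of caching the check inside the function so it runs on first use rather than at import; `some_public_op` and the capability threshold are hypothetical stand-ins:

```python
import functools


@functools.lru_cache(maxsize=None)
def _is_triton_available() -> bool:
    # The device-capability query below is what creates the CUDA context, so
    # wrapping it in a cached function defers that cost to the first real call
    # instead of `import xformers.ops`.
    try:
        import triton  # noqa: F401
        import torch

        major, _ = torch.cuda.get_device_capability()
        return major >= 7  # illustrative threshold, not the real xformers check
    except Exception:
        return False


def some_public_op(x):
    # Hypothetical public entry point: the check happens here, lazily,
    # never during import.
    if _is_triton_available():
        return x  # would dispatch to the Triton path
    return x  # would dispatch to a non-Triton fallback
```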
It's possible that commit 737c2e6, which just landed, fixes this.
Hello, confirming this issue is still occurring; we're seeing it locally in xlformers as well.
Possibly this will be okay now after https://github.com/facebookresearch/xformers/commit/be13e229b52d9d0bdf4422be931c67c492b8092f if you set `XFORMERS_ENABLE_TRITON=1`?
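A short usage sketch of that workaround, assuming a build that includes be13e22: set the flag before xformers is imported, and "fork" workers become usable again because the parent process never touches CUDA at import time.

```python
import os

# Per the suggestion above: set before importing xformers so the import
# no longer needs to query the GPU.
os.environ["XFORMERS_ENABLE_TRITON"] = "1"

import multiprocessing as mp

import xformers.ops  # noqa: F401  # should no longer create a CUDA context


def worker(i):
    return i * i


if __name__ == "__main__":
    # "fork" works because no CUDA context exists in the parent process.
    with mp.get_context("fork").Pool(2) as pool:
        print(pool.map(worker, range(4)))
```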
Setting this works for me with xformers v0.0.27. Thanks!