
[FEATURE]: Build kernel only when executed

Open FrankLeeeee opened this issue 2 years ago • 1 comments

Describe the feature

In the current Colossal-AI implementation, the CUDA kernels are built in one of two ways:

  1. at install time, when running CUDA_EXT=1 pip install colossalai
  2. at import time, when importing Colossal-AI

However, building all kernels takes at least 5 minutes, which is too long for many users. Moreover, a given user rarely needs every kernel. Building only the kernels that the current program actually uses would best balance user experience with the completeness of the library.
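To make the request concrete, here is a minimal sketch of the build-on-first-use pattern in plain Python. It is not Colossal-AI's actual implementation: `register_kernel`, `load_kernel`, `BUILD_LOG`, and the two builder functions are hypothetical names, and the builders return placeholder objects where a real one would compile and load an extension.

```python
import functools
from typing import Callable, Dict, List

# Hypothetical registry mapping a kernel name to the function that builds it.
_BUILDERS: Dict[str, Callable[[], object]] = {}

# Records which kernels were actually built, for illustration only.
BUILD_LOG: List[str] = []


def register_kernel(name: str):
    """Register a builder under `name` without running it."""
    def decorator(fn: Callable[[], object]) -> Callable[[], object]:
        _BUILDERS[name] = fn
        return fn
    return decorator


@functools.lru_cache(maxsize=None)
def load_kernel(name: str) -> object:
    """Build the kernel on first request and reuse the cached result afterwards."""
    if name not in _BUILDERS:
        raise KeyError(f"no builder registered for kernel '{name}'")
    BUILD_LOG.append(name)
    return _BUILDERS[name]()


@register_kernel("fused_optim")
def _build_fused_optim() -> object:
    # Placeholder: a real builder would run the slow JIT compilation here.
    return object()


@register_kernel("layer_norm")
def _build_layer_norm() -> object:
    # Placeholder: never executed unless some code asks for "layer_norm".
    return object()


if __name__ == "__main__":
    first = load_kernel("fused_optim")   # triggers the (placeholder) build
    second = load_kernel("fused_optim")  # served from the cache
    print(first is second, BUILD_LOG)
```

With this layout, importing the package costs nothing, and a program that only uses the fused optimizer never pays for the other kernels.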

FrankLeeeee avatar Jan 05 '23 07:01 FrankLeeeee

Update: it seems that only CUDA_EXT=1 pip install colossalai works. If I install without CUDA_EXT=1, I get errors. Previously, I tried importing while colossalai was in the PWD, and it started to build the extension. If I import it from anywhere else, it just raises errors.


Please at least add instructions to the README file. I got the error below:

ImportError: cannot import name 'fused_optim' from 'colossalai._C' 

I did not find any description of this. The extensions only started to build when I happened to type import colossalai. I think I have wasted too much time on this.

flymin avatar Jan 05 '23 15:01 flymin

Hi @flymin, I have opened PR #2374 to enable runtime builds and reduce the frustration during installation. Hope it helps.

FrankLeeeee avatar Jan 06 '23 08:01 FrankLeeeee