
[BUG]: Build ColossalAI from Source

Open FrankieDong opened this issue 2 years ago • 7 comments

🐛 Describe the bug

When I run SD2.0, it tells me to build ColossalAI from source. When I build, it fails with `RuntimeError: Error building extension 'multihead_attention'`.

Does that mean something is wrong with my environment? Maybe gcc?

```
Traceback (most recent call last):
  File "", line 2, in 
  File "", line 34, in 
  File "/home/zrytest/diffusion/ColossalAI/setup.py", line 6, in 
    from colossalai.kernel.op_builder.utils import get_cuda_bare_metal_version
  File "/home/zrytest/diffusion/ColossalAI/colossalai/__init__.py", line 1, in 
    from .initialize import (
  File "/home/zrytest/diffusion/ColossalAI/colossalai/initialize.py", line 23, in 
    from colossalai.engine.schedule import NonPipelineSchedule, PipelineSchedule, InterleavedPipelineSchedule, get_tensor_shape
  File "/home/zrytest/diffusion/ColossalAI/colossalai/engine/__init__.py", line 1, in 
    from ._base_engine import Engine
  File "/home/zrytest/diffusion/ColossalAI/colossalai/engine/_base_engine.py", line 10, in 
    from colossalai.gemini.ophooks import register_ophooks_recursively, BaseOpHook
  File "/home/zrytest/diffusion/ColossalAI/colossalai/gemini/__init__.py", line 1, in 
    from .chunk import ChunkManager, TensorInfo, TensorState, search_chunk_configuration
  File "/home/zrytest/diffusion/ColossalAI/colossalai/gemini/chunk/__init__.py", line 1, in 
    from .chunk import Chunk, ChunkFullError, TensorInfo, TensorState
  File "/home/zrytest/diffusion/ColossalAI/colossalai/gemini/chunk/chunk.py", line 9, in 
    from colossalai.utils import get_current_device
  File "/home/zrytest/diffusion/ColossalAI/colossalai/utils/__init__.py", line 3, in 
    from .checkpointing import load_checkpoint, save_checkpoint
  File "/home/zrytest/diffusion/ColossalAI/colossalai/utils/checkpointing.py", line 14, in 
    from .common import is_using_pp
  File "/home/zrytest/diffusion/ColossalAI/colossalai/utils/common.py", line 21, in 
    from colossalai.kernel import fused_optim
  File "/home/zrytest/diffusion/ColossalAI/colossalai/kernel/__init__.py", line 19, in 
    multihead_attention = MultiHeadAttnBuilder().load()
  File "/home/zrytest/diffusion/ColossalAI/colossalai/kernel/op_builder/builder.py", line 59, in load
    op_module = load(name=self.name,
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'multihead_attention'
```

Environment

No response

FrankieDong avatar Dec 29 '22 00:12 FrankieDong

```
1 error detected in the compilation of "/tmp/tmpxft_00007af2_00000000-11_cuda_util.compute_37.cpp1.ii".
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/zrytest/diffusion/ColossalAI/colossalai/kernel/__init__.py", line 16, in 
    from colossalai._C import multihead_attention
ImportError: cannot import name 'multihead_attention' from 'colossalai._C' (unknown location)
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/home/zrytest/anaconda3/envs/ldm4/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
```

FrankieDong avatar Dec 29 '22 00:12 FrankieDong

You can download ColossalAI directly from https://www.colossalai.org/download . We use our own pip source, so don't run `pip install colossalai` against the public PyPI.

If you still have issues after downloading from the website, you can run `colossalai check -i` to verify your installation. You should make sure your CUDA, torch, and ColossalAI versions match.

We are working on building only the needed kernels at runtime, and will release that soon.
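The version-match check above essentially compares the CUDA version torch was built with against the local `nvcc` toolkit (ColossalAI's `setup.py` imports a `get_cuda_bare_metal_version` helper for this). A minimal sketch of that comparison, assuming the usual `nvcc --version` banner format; the function name and parsing here are illustrative, not ColossalAI's actual code:

```python
import re

def parse_cuda_release(nvcc_output):
    """Pull the '(major, minor)' pair out of the 'release X.Y' line of `nvcc --version` output."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None

# Example nvcc banner (assumed format). In a real check you would compare this
# against torch.version.cuda, e.g.:
#   parse_cuda_release(banner) == tuple(map(int, torch.version.cuda.split(".")[:2]))
sample = "Cuda compilation tools, release 11.3, V11.3.109"
print(parse_cuda_release(sample))  # (11, 3)
```

A mismatch here (e.g. torch built for CUDA 11.3 but `nvcc` reporting 10.2) is a common cause of JIT extension build failures like the one in this issue.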

FrankLeeeee avatar Dec 29 '22 01:12 FrankLeeeee

did you install ninja? It is in requirements.txt
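The JIT build path in `torch.utils.cpp_extension` shells out to `ninja`, so a missing or broken ninja binary fails with exactly this kind of error. A quick sanity check, sketched in Python (the helper name is mine, not from the repo):

```python
import shutil
import subprocess

def ninja_version():
    """Return the installed ninja version string, or None if ninja is not on PATH."""
    if shutil.which("ninja") is None:
        return None
    out = subprocess.run(["ninja", "--version"], capture_output=True, text=True)
    return out.stdout.strip() or None

print("ninja:", ninja_version() or "NOT FOUND -- try `pip install ninja`")
```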

feifeibear avatar Dec 29 '22 01:12 feifeibear

> did you install ninja? It is in requirements.txt

Of course. And I may have found the problem; I'll try it.

FrankieDong avatar Dec 29 '22 06:12 FrankieDong

@FrankLeeeee Have you solved it? I'm running into this problem too.

haofanwang avatar Jan 05 '23 10:01 haofanwang

> @FrankLeeeee Have you solved it? I'm running into this problem too.

This problem occurred because I hadn't pulled the right version from GitHub.

FrankieDong avatar Jan 06 '23 00:01 FrankieDong

@FrankieDong What do you mean by the right version? Is it the main branch by default?

haofanwang avatar Jan 06 '23 07:01 haofanwang

We have updated a lot since then. This issue was closed due to inactivity. Thanks. See https://github.com/hpcaitech/ColossalAI#Installation

binmakeswell avatar Apr 14 '23 08:04 binmakeswell