[BUG]: I cannot install ColossalAI

Open shendl1978 opened this issue 2 years ago • 5 comments

🐛 Describe the bug

I built ColossalAI from source, but it does not install correctly.

I then tried installing it with pip install colossalai, but the installation is still broken.

colossalai check -i
Traceback (most recent call last):
  File "/usr/local/bin/colossalai", line 5, in <module>
    from colossalai.cli import cli
  File "/usr/local/lib/python3.10/dist-packages/colossalai/__init__.py", line 1, in <module>
    from .initialize import (
  File "/usr/local/lib/python3.10/dist-packages/colossalai/initialize.py", line 18, in <module>
    from colossalai.amp import AMP_TYPE, convert_to_amp
  File "/usr/local/lib/python3.10/dist-packages/colossalai/amp/__init__.py", line 9, in <module>
    from .torch_amp import convert_to_torch_amp
  File "/usr/local/lib/python3.10/dist-packages/colossalai/amp/torch_amp/__init__.py", line 9, in <module>
    from .torch_amp import TorchAMPLoss, TorchAMPModel, TorchAMPOptimizer
  File "/usr/local/lib/python3.10/dist-packages/colossalai/amp/torch_amp/torch_amp.py", line 10, in <module>
    from colossalai.nn.optimizer import ColossalaiOptimizer
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/__init__.py", line 1, in <module>
    from ._ops import *
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/_ops/__init__.py", line 1, in <module>
    from .addmm import colo_addmm
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/_ops/addmm.py", line 5, in <module>
    from ._utils import GeneralTensor, Number, convert_to_colo_tensor
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/_ops/_utils.py", line 8, in <module>
    from colossalai.nn.layer.utils import divide
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/layer/__init__.py", line 1, in <module>
    from .colossalai_layer import *
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/layer/colossalai_layer/__init__.py", line 2, in <module>
    from .dropout import Dropout
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/layer/colossalai_layer/dropout.py", line 4, in <module>
    from ..parallel_1d import *
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/layer/parallel_1d/__init__.py", line 1, in <module>
    from .layers import (Classifier1D, Dropout1D, Embedding1D, LayerNorm1D, Linear1D, Linear1D_Col, Linear1D_Row,
  File "/usr/local/lib/python3.10/dist-packages/colossalai/nn/layer/parallel_1d/layers.py", line 17, in <module>
    from colossalai.kernel import LayerNorm
  File "/usr/local/lib/python3.10/dist-packages/colossalai/kernel/__init__.py", line 1, in <module>
    from .cuda_native import FusedScaleMaskSoftmax, LayerNorm, MultiHeadAttention
  File "/usr/local/lib/python3.10/dist-packages/colossalai/kernel/cuda_native/__init__.py", line 1, in <module>
    from .layer_norm import MixedFusedLayerNorm as LayerNorm
  File "/usr/local/lib/python3.10/dist-packages/colossalai/kernel/cuda_native/layer_norm.py", line 12, in <module>
    from colossalai.kernel.op_builder.layernorm import LayerNormBuilder
ModuleNotFoundError: No module named 'colossalai.kernel.op_builder'
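The traceback ends because the installed package has no colossalai.kernel.op_builder submodule. A minimal sketch for checking this on disk without importing the package (importing is what fails here). The helper names find_package_dir and has_submodule are hypothetical and not part of ColossalAI; only the standard library is used.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(package: str) -> Optional[Path]:
    """Locate a top-level package directory without importing the package."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def has_submodule(package: str, dotted_submodule: str) -> bool:
    """Check on disk whether <package>/<dotted_submodule> exists as a directory or .py file."""
    root = find_package_dir(package)
    if root is None:
        return False
    target = root.joinpath(*dotted_submodule.split("."))
    return target.is_dir() or target.with_suffix(".py").is_file()


if __name__ == "__main__":
    # Shows which copy of the package Python would pick up, and whether
    # the submodule named in the traceback is present in that copy.
    print("colossalai found at:", find_package_dir("colossalai"))
    print("kernel.op_builder present:", has_submodule("colossalai", "kernel.op_builder"))
```

If the second line prints False, the installed copy is incomplete, which would be consistent with the ModuleNotFoundError above. It does not say why the files are missing.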

Environment

Ubuntu 22.04

shendl1978 avatar Feb 18 '23 01:02 shendl1978

It seems this issue may help you: https://github.com/hpcaitech/ColossalAI/issues/2771

Gy-Lu avatar Feb 18 '23 06:02 Gy-Lu

Same on Windows 11. I built from source and followed the advice in #2771 not to run it from the root directory.
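One common reason for avoiding the source root is that a local source folder can shadow the installed package. Whether that is what #2771 describes is an assumption here, not something stated in this thread. A rough sketch for checking it, using a hypothetical helper resolved_from_cwd and only the standard library:

```python
import importlib.util
from pathlib import Path


def resolved_from_cwd(package: str) -> bool:
    """Return True if Python would load `package` from under the current directory."""
    spec = importlib.util.find_spec(package)
    # Built-in, frozen, and namespace packages have no usable file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return False
    origin = Path(spec.origin).resolve()
    return Path.cwd().resolve() in origin.parents


if __name__ == "__main__":
    # Run this from the directory where the failing command was run.
    print("colossalai loaded from current directory:", resolved_from_cwd("colossalai"))
```

If this prints True, Python is picking up the local folder instead of the copy in dist-packages, and running from another directory should change the result.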

NKcqx avatar Feb 18 '23 17:02 NKcqx

@NKcqx did you use WSL on Windows?

FrankLeeeee avatar Feb 21 '23 07:02 FrankLeeeee

@shendl1978 is it possible to reproduce this in Docker?

FrankLeeeee avatar Feb 21 '23 07:02 FrankLeeeee

Several users have run into the same problem, but it works fine on all of our machines. If it can be reproduced in Docker, we can build the same environment directly and look for the bug, which would significantly speed up the process. @shendl1978 @NKcqx
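To help rebuild the same environment, a short environment report can be attached to the issue. This is a minimal sketch using only the standard library. The function name environment_report is hypothetical, and the package list (torch, colossalai) is an assumption about which versions matter here.

```python
import platform
import sys
from importlib import metadata
from typing import Dict, Iterable


def environment_report(packages: Iterable[str]) -> Dict[str, str]:
    """Collect the Python version, platform string, and installed package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report(["torch", "colossalai"]).items():
        print(f"{key}: {value}")
```

The version lookup reads the installed distribution metadata and does not import the packages, so it still works when importing colossalai fails.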

FrankLeeeee avatar Feb 21 '23 07:02 FrankLeeeee

Hi @NKcqx @shendl1978, we have made a lot of updates. Please check the latest installation instructions: https://github.com/hpcaitech/ColossalAI#Installation This issue was closed due to inactivity. Thanks.

binmakeswell avatar Apr 19 '23 10:04 binmakeswell