
[BUG]: cannot import name 'CpuAdamArmExtension' from 'colossalai.kernel.extensions' (unknown location)

Open Fence opened this issue 10 months ago • 1 comment

Is there an existing issue for this bug?

  • [x] I have searched the existing issues

The bug has not been fixed in the latest main branch

  • [x] I have checked the latest main branch

Do you feel comfortable sharing a concise (minimal) script that reproduces the error? :)

Yes, I will share a minimal reproducible script.

🐛 Describe the bug

I installed the environment following "https://github.com/hpcaitech/ColossalAI/tree/main/applications/ColossalChat#install-the-environment"

with the latest main branch (colossalai 0.4.8, 28.02.2025).

At first I encountered the same bug as "https://github.com/hpcaitech/ColossalAI/issues/5458".

I tried `ln -s ../../extensions .` inside the colossalai.kernel folder,

but then encountered a new problem:

"ImportError: cannot import name 'CpuAdamArmExtension' from 'colossalai.kernel.extensions' (unknown location)"

My launch command is:

colossalai run --hostfile path-to-host-file --nproc_per_node 8 lora_finetune.py --pretrained path-to-DeepSeek-R1-bf16 --dataset path-to-dataset.jsonl --plugin moe --lr 2e-5 --max_length 256 -g --ep 8 --pp 3 --batch_size 24 --lora_rank 8 --lora_alpha 16 --num_epochs 2 --warmup_steps 8 --tensorboard_dir logs --save_dir DeepSeek-R1-bf16-lora

Full log is:

Traceback (most recent call last):
  File "/mnt/code/ColossalAI/applications/ColossalChat/examples/training_scripts/lora_finetune.py", line 23, in <module>
    from colossalai.booster import Booster
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/booster/__init__.py", line 2, in <module>
    from .booster import Booster
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/booster/booster.py", line 27, in <module>
    from .plugin import Plugin
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/booster/plugin/__init__.py", line 1, in <module>
    from .gemini_plugin import GeminiPlugin
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/booster/plugin/gemini_plugin.py", line 31, in <module>
    from colossalai.shardformer import ShardConfig, ShardFormer
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/__init__.py", line 1, in <module>
    from .shard import GradientCheckpointConfig, ModelSharder, PipelineGradientCheckpointConfig, ShardConfig, ShardFormer
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/shard/__init__.py", line 3, in <module>
    from .sharder import ModelSharder
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/shard/sharder.py", line 10, in <module>
    from ..policies.auto_policy import get_autopolicy
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/policies/auto_policy.py", line 6, in <module>
    from .base_policy import Policy
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/policies/base_policy.py", line 13, in <module>
    from ..layer.normalization import BaseLayerNorm
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/layer/__init__.py", line 2, in <module>
    from .attn import AttnMaskType, ColoAttention, RingAttention, get_pad_info
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/shardformer/layer/attn.py", line 11, in <module>
    from colossalai.kernel.kernel_loader import (
  File "/opt/conda/envs/colossal-chat/lib/python3.10/site-packages/colossalai/kernel/kernel_loader.py", line 4, in <module>
    from .extensions import (
ImportError: cannot import name 'CpuAdamArmExtension' from 'colossalai.kernel.extensions' (unknown location)
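
A note on the `(unknown location)` suffix: Python typically prints it when the module it resolved has no `__file__`, which most often means the name matched a namespace package (a directory without an `__init__.py`) rather than a regular package. The sketch below is one way to check how `colossalai.kernel.extensions` resolves in a given environment. The helper name `describe_package` is hypothetical, and reading this particular failure as a namespace-package issue is an assumption, not something the log confirms.

```python
import importlib.util


def describe_package(name):
    """Report how `name` resolves on the import path.

    Parent packages are imported by find_spec, but the target module
    itself is not executed.
    """
    try:
        spec = importlib.util.find_spec(name)
    except ImportError as exc:
        # Raised (as ModuleNotFoundError) when a parent package is missing,
        # or when importing a parent package itself fails.
        return {"found": False, "error": repr(exc)}
    if spec is None:
        return {"found": False}
    locations = spec.submodule_search_locations
    return {
        "found": True,
        "origin": spec.origin,  # None for namespace packages
        "locations": list(locations or []),
        # A namespace package has search locations but no origin file.
        "is_namespace": spec.origin is None and locations is not None,
    }


if __name__ == "__main__":
    print(describe_package("colossalai.kernel.extensions"))
```

If `is_namespace` comes back `True`, the entries in `locations` show which directory Python picked up, which may help tell a stray symlink or leftover folder apart from the real package.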

Environment

GPU: A100

CUDA: 12.3

Docker image: docker.xuanyuan.me/hpcaitech/colossalai:latest

Fence, Feb 28 '25 09:02

Hello, please delete `~/.cache/colossalai`, and then reinstall ColossalAI using `pip install -e .`.
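
For anyone who prefers to script these two steps, here is a minimal Python sketch. The function names are hypothetical, and the repository path in the usage comment is simply the one that appears in the traceback above, so adjust it to your own checkout.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def clear_colossalai_cache(cache_dir=None):
    """Delete the cache directory; return True if something was removed."""
    if cache_dir is None:
        # Default location taken from the advice above.
        cache_dir = Path.home() / ".cache" / "colossalai"
    cache_dir = Path(cache_dir)
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    return False


def reinstall_editable(repo_dir):
    """Run `pip install -e .` inside repo_dir using the current interpreter."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-e", "."],
        cwd=repo_dir,
        check=True,  # raise CalledProcessError if pip fails
    )


# Example usage (not executed here):
# clear_colossalai_cache()
# reinstall_editable("/mnt/code/ColossalAI")
```

Calling pip through `sys.executable -m pip` keeps the reinstall inside the same conda environment that runs the script.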

flybird11111, Aug 12 '25 05:08