
[BUG]: AttributeError: type object 'ColoParameter' has no attribute 'from_torch_tensor' when run hybrid_parallel example

Open ztorchan opened this issue 10 months ago • 3 comments

🐛 Describe the bug

I noticed that the from_torch_tensor method of the ColoParameter and ColoTensor classes was removed in PR #4479 (colossalai/tensor/colo_parameter.py, colossalai/tensor/colo_tensor.py).

However, this method is still called in ColossalAI/colossalai/legacy/pipeline/pipelinable.py: https://github.com/hpcaitech/ColossalAI/blob/641b1ee71a19e2337f3363620b228dd355835b04/colossalai/legacy/pipeline/pipelinable.py#L120

This causes the error: AttributeError: type object 'ColoParameter' has no attribute 'from_torch_tensor'
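
To confirm this kind of mismatch on a given install, you can check whether a class still exposes the method before it is called. The sketch below uses a stand-in class rather than ColossalAI itself; `StandInParameter`, `still_present`, and `missing_attributes` are hypothetical names used only for illustration.

```python
def missing_attributes(obj, names):
    """Return the entries of `names` that `obj` does not expose as attributes."""
    return [name for name in names if not hasattr(obj, name)]


class StandInParameter:
    """Hypothetical stand-in for a class that lost a classmethod in a refactor."""

    @classmethod
    def still_present(cls):
        # A method that survived the refactor (illustrative only).
        return cls()


# 'from_torch_tensor' is not defined on the stand-in, so it is reported as missing.
print(missing_attributes(StandInParameter, ["still_present", "from_torch_tensor"]))
```

With ColossalAI installed, passing the real ColoParameter class in place of the stand-in should show whether from_torch_tensor exists in that version.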

Environment

Python 3.10.10, CUDA 11.6.1, torch 1.12.1

ztorchan avatar Apr 08 '24 12:04 ztorchan

The legacy directory is now deprecated, and some code that depends on it is out of date. Try referring to the new implementations instead.

char-1ee avatar Apr 08 '24 16:04 char-1ee

@char-1ee When I use the latest version of ColossalAI, I get the error "ModuleNotFoundError: No module named 'colossalai.context.moe_context'". I don't know which version of ColossalAI I should use. Do you have any suggestions?
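
When it is unclear which version is installed, or whether it still ships a given module, a small standard-library check can help narrow things down. This is a minimal sketch: the helper names `installed_version` and `module_available` are hypothetical, and it assumes the distribution is registered under the name "colossalai".

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_available(module_name):
    """Return True if `module_name` can be located in the current environment.

    For a dotted name, find_spec imports the parent packages but not the
    module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


# Assumption: the distribution name is "colossalai".
print("colossalai version:", installed_version("colossalai"))
print("moe_context present:", module_available("colossalai.context.moe_context"))
```

If the second line prints False, the installed version does not provide that module, and any code importing it was written against a different release.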

awer-A avatar Apr 09 '24 02:04 awer-A

@awer-A You can try installing ColossalAI from source. If you want to set up a distributed training environment with ColossalAI on your server, I suggest following the instructions in ColossalAI's Dockerfile step by step: https://github.com/hpcaitech/ColossalAI/blob/main/docker/Dockerfile

char-1ee avatar Apr 09 '24 05:04 char-1ee