[BUG]: ModuleNotFoundError: No module named 'colossalai.nn.optimizer.zero_optimizer'
🐛 Describe the bug
I installed ColossalAI with the command pip install colossalai==0.1.11rc3+torch1.12cu11.3 -f https://release.colossalai.org
But when I followed https://github.com/hpcaitech/ColossalAI/tree/main/examples/tutorial#-run-opt-finetuning-and-inference and ran bash ./run_clm_synthetic.sh, I got the following error:
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/he.yan/ColossalAI/examples/tutorial/opt/opt/run_clm.py:46 in <module> │
│ │
│ 43 from colossalai.core import global_context as gpc │
│ 44 from colossalai.logging import disable_existing_loggers, get_dist_logger │
│ 45 from colossalai.nn.optimizer import HybridAdam │
│ ❱ 46 from colossalai.nn.optimizer.zero_optimizer import ZeroOptimizer │
│ 47 from colossalai.nn.parallel import ZeroDDP │
│ 48 from colossalai.tensor import ProcessGroup │
│ 49 from colossalai.utils import get_current_device, get_dataloader │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'colossalai.nn.optimizer.zero_optimizer'
Environment
Python 3.8.15 torch1.12cu11.3
Hi @heya5, that's kind of strange. Let me try to reproduce this error.
I think your latest examples import colossalai.nn.optimizer.zero_optimizer, but the code in the released package doesn't have that module...
By the way, may I ask if the OPT model in your library implements model parallelism (tensor parallelism or pipeline parallelism)?
Yes, the release seems to have some problems. I am initiating a new release, and you should be able to download the correct version by the end of today. Meanwhile, the OPT model does not have tensor parallelism or pipeline parallelism, as it is implemented by Hugging Face.
Hi @heya5, a new patch has been released. You can download the v0.1.11rc4 version from our website https://www.colossalai.org/download. I have tested the tutorial and it worked fine. Let me know if you encounter further issues.
# example
pip install colossalai==0.1.11rc4+torch1.12cu11.3 -f https://release.colossalai.org
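After reinstalling, a quick check along these lines may help confirm that the installed build actually ships the module the tutorial imports. This is only a sketch using the Python standard library; the helper names module_available and installed_version are made up for illustration and are not part of ColossalAI.

```python
import importlib.util
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    Note: find_spec imports the parent packages of `name`, so this
    will import `colossalai` itself if it is installed.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


def installed_version(dist: str):
    """Return the installed version string of `dist`, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Module path taken from the traceback in this issue.
    target = "colossalai.nn.optimizer.zero_optimizer"
    print("colossalai version:", installed_version("colossalai"))
    print(f"{target} present:", module_available(target))
```

If the second line reports False, the installed package still lacks the module, and the tutorial script will fail with the same ModuleNotFoundError.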