[BUG] import deepspeed error when building from source

Open kisseternity opened this issue 2 years ago • 4 comments

In the latest release, v0.6.5, I built from source using pip install . in the DeepSpeed directory, and it installed successfully. However, when I use it with Megatron, import deepspeed reports the following error:

Traceback (most recent call last):
  File "pretrain_bert.py", line 23, in <module>
    import deepspeed
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/__init__.py", line 13, in <module>
    from . import ops
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/__init__.py", line 1, in <module>
    from . import adam
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/adam/__init__.py", line 1, in <module>
    from .cpu_adam import DeepSpeedCPUAdam
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/adam/cpu_adam.py", line 9, in <module>
    from ..op_builder import CPUAdamBuilder
ModuleNotFoundError: No module named 'deepspeed.ops.op_builder'

    from megatron import get_args
  File "/home/notebook/code/personal/bert/Megatron-DeepSpeed-v3.0/megatron/__init__.py", line 17, in <module>
    from .global_vars import get_args
  File "/home/notebook/code/personal/bert/Megatron-DeepSpeed-v3.0/megatron/global_vars.py", line 27, in <module>
    from .arguments import parse_args
  File "/home/notebook/code/personal/bert/Megatron-DeepSpeed-v3.0/megatron/arguments.py", line 22, in <module>
    import deepspeed
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/__init__.py", line 13, in <module>
    from . import ops
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/__init__.py", line 1, in <module>
    from . import adam
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/adam/__init__.py", line 1, in <module>
    from .cpu_adam import DeepSpeedCPUAdam
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/adam/cpu_adam.py", line 9, in <module>
    from ..op_builder import CPUAdamBuilder
ModuleNotFoundError: No module named 'deepspeed.ops.op_builder'

Is this a bug? (screenshot attached: Image_20220628173023)

kisseternity avatar Jun 28 '22 09:06 kisseternity

@kisseternity Are you able to run it if you install from PyPI (pip install deepspeed==0.6.5)?

I've seen a similar issue in the past and it was likely due to the symlinks we create in setup.py - but I haven't been able to reproduce it in any system / environment I have access to.

mrwyattii avatar Jun 28 '22 16:06 mrwyattii

I've tried the v0.6.5 release version and it runs fine. My earlier description was wrong: I actually installed from the master branch. Is there any difference between the two when installing? I think the cause is that, after installing from master, two folders named op_builder and csrc are missing, as the two screenshots show.

Release version install: (screenshot)

Master code install: (screenshot)
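
For anyone comparing the two installs without screenshots, here is a minimal sketch of the same check in Python. It locates the installed package without importing it and reports which of the two folders do not resolve to a directory. The helper names are made up for illustration, and the assumption that both folders sit under deepspeed/ops/ comes from the module path in the traceback (deepspeed.ops.op_builder); the location of csrc in particular is a guess.

```python
import importlib.util
from pathlib import Path


def find_package_dir(name="deepspeed"):
    """Return the package's install directory without importing it, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def missing_subdirs(ops_dir, expected=("op_builder", "csrc")):
    """List expected entries that do not resolve to a directory under ops_dir."""
    ops_dir = Path(ops_dir)
    # is_dir() follows symlinks, so a dangling symlink also counts as missing.
    return [name for name in expected if not (ops_dir / name).is_dir()]


if __name__ == "__main__":
    pkg = find_package_dir()
    if pkg is None:
        print("deepspeed is not installed in this environment")
    else:
        print("package dir:", pkg)
        # Assumption: the folders are expected under <package>/ops/.
        print("missing under ops/:", missing_subdirs(pkg / "ops"))
```

An empty list for the release install and a non-empty one for the master install would match what the screenshots are described as showing.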

kisseternity avatar Jun 29 '22 01:06 kisseternity

@kisseternity thanks for sharing! I think the issue is stemming from symbolic links we create in setup.py - but I'm unable to reproduce this behavior in my own environment. Could you share what OS you are using and what shell you use?

If you are running Linux, you can share the output of:

lsb_release -a
echo $SHELL && $SHELL --version

mrwyattii avatar Jun 29 '22 16:06 mrwyattii

FYI: (screenshot attached)

kisseternity avatar Jun 30 '22 03:06 kisseternity

Hi @kisseternity - this issue seems fairly stale now, so I'll close it, mostly since we're now on DeepSpeed 0.10.0. If you are able, could you check whether you still hit this issue, given all the changes we've made (and pip has made) since then?

If you still have any remaining issues, please either re-open this issue or open a new one that links to this one, and we will prioritize it.

loadams avatar Aug 16 '23 14:08 loadams

May I ask how to solve the problem "ModuleNotFoundError: No module named 'deepspeed.ops.op_builder'"?

wuQi-666 avatar Oct 22 '23 02:10 wuQi-666

@wuQi-666 - that usually means that DeepSpeed was not installed properly. Can you please share the output of pip list and of python -c "import deepspeed; print('op_builder:', deepspeed.ops.op_builder)"?
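
If the one-liner only produces a traceback, a slightly more defensive variant may be easier to paste into a reply. This is a sketch only: check_import is a hypothetical helper, not part of DeepSpeed, and the module names it tries are the ones that appear in the traceback earlier in this thread.

```python
import importlib


def check_import(module_name):
    """Try to import module_name; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        return False, f"{type(exc).__name__}: {exc}"
    # On success, report where the module was loaded from, if known.
    return True, getattr(module, "__file__", None) or "<no __file__>"


if __name__ == "__main__":
    for name in ("deepspeed", "deepspeed.ops.op_builder"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

On success, the printed path shows which site-packages directory the module was loaded from, which helps spot a second, stale install shadowing the intended one.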

loadams avatar Oct 23 '23 16:10 loadams