finetune-gpt2xl

AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'

Open remotejob opened this issue 3 years ago • 8 comments

I tried to use your script (gpt2-xl), but I get an error: AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'

pip list

Package             Version
certifi             2021.5.30
charset-normalizer  2.0.4
click               8.0.1
configparser        5.0.2
datasets            1.8.0
deepspeed           0.4.0
dill                0.3.4
docker-pycreds      0.4.0
filelock            3.0.12
fsspec              2021.7.0
gitdb               4.0.7
GitPython           3.1.18
huggingface-hub     0.0.8
idna                3.2
importlib-metadata  4.7.0
joblib              1.0.1
multiprocess        0.70.12.2
ninja               1.10.2
numpy               1.21.2
packaging           21.0
pandas              1.3.2
pathtools           0.1.2
Pillow              8.3.1
pip                 21.2.4
promise             2.3
protobuf            3.17.3
psutil              5.8.0
pyarrow             3.0.0
pyparsing           2.4.7
python-dateutil     2.8.2
pytz                2021.1
PyYAML              5.4.1
regex               2021.8.21
requests            2.26.0
sacremoses          0.0.45
sentry-sdk          1.3.1
setuptools          57.4.0
shortuuid           1.0.1
six                 1.16.0
smmap               4.0.0
subprocess32        3.5.4
tensorboardX        1.8
tokenizers          0.10.3
torch               1.9.0
torchvision         0.10.0
tqdm                4.49.0
transformers        4.7.0
triton              1.0.0
typing-extensions   3.10.0.0
urllib3             1.26.6
wandb               0.12.0
wheel               0.37.0
xxhash              2.0.2
zipp                3.5.0

remotejob avatar Aug 26 '21 01:08 remotejob

Without this block in ds_config.json:

"optimizer": {
  "type": "AdamW",
  "params": {
    "lr": "auto",
    "betas": "auto",
    "eps": "auto",
    "weight_decay": "auto"
  }
}

everything works; training takes 17 min.
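
The workaround described here, dropping the "optimizer" section from ds_config.json, could be scripted roughly as follows. This is only a sketch: the function name and the output file name are made up for illustration, and it assumes the config is plain JSON. With the section removed, DeepSpeed no longer builds its own CPU Adam, so the run relies on whatever optimizer the training script sets up.

```python
import json
from pathlib import Path


def drop_optimizer_block(config_path, output_path):
    """Write a copy of a DeepSpeed JSON config without its "optimizer" section.

    Hypothetical helper for illustration; not part of DeepSpeed or the repo.
    """
    config = json.loads(Path(config_path).read_text())
    # pop() with a default does nothing if the key is already absent.
    config.pop("optimizer", None)
    Path(output_path).write_text(json.dumps(config, indent=2))
    return config


# Example call (file names are assumptions):
# drop_optimizer_block("ds_config.json", "ds_config_no_optimizer.json")
```

Writing to a separate file keeps the original config around, so the optimizer block can be restored once the underlying build problem is fixed.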

remotejob avatar Aug 26 '21 01:08 remotejob

Same problem

Elfsong avatar Nov 10 '21 18:11 Elfsong

I ran into this too. Before the AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam', another error appeared: cannot make a dir in /tmp/torch_extensions/build for cpu_adam. So I changed DEFAULT_TORCH_EXTENSION_PATH in the file /anaconda3/envs/XXXXX/lib/python3.6/site-packages/deepspeed/ops/op_builder/builder.py from "/tmp/torch_extensions/" to a path where I have permission to create folders. Then it works.
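
A less invasive variant of the same idea may be to point the build directory somewhere writable through the TORCH_EXTENSIONS_DIR environment variable instead of editing builder.py. PyTorch's extension loader reads that variable; whether deepspeed 0.4.0 honours it as well is an assumption worth verifying. The sketch below uses hypothetical helper names and candidate paths.

```python
import os
import tempfile
from pathlib import Path


def pick_writable_dir(candidates):
    """Return the first candidate directory that can be created and written to."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        try:
            path.mkdir(parents=True, exist_ok=True)
            # Creating a temporary file proves we have write permission.
            with tempfile.TemporaryFile(dir=path):
                pass
        except OSError:
            continue  # not creatable or not writable, try the next one
        return str(path)
    raise RuntimeError("none of the candidate directories is writable")


def configure_extensions_dir(candidates):
    """Set TORCH_EXTENSIONS_DIR; call this before importing deepspeed."""
    chosen = pick_writable_dir(candidates)
    os.environ["TORCH_EXTENSIONS_DIR"] = chosen
    return chosen


# Example call (paths are assumptions):
# configure_extensions_dir(["/tmp/torch_extensions", "~/.cache/torch_extensions"])
```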

ziweiji avatar Nov 17 '21 04:11 ziweiji

For me, I noticed it was exiting on a ['which', 'c++'] eval just before AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'

In my case, installing / updating g++ resolved the issue.
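
A quick way to see whether the build tools are visible before launching training is to look them up on PATH, which is roughly what the ['which', 'c++'] call does. A minimal sketch; the list of tool names is an illustrative assumption, not DeepSpeed's own check.

```python
import shutil


def find_build_tools(names=("c++", "g++", "ninja")):
    """Map each tool name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


def report_build_tools():
    """Print one line per tool so a missing compiler is easy to spot."""
    for name, path in find_build_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

If c++ shows up as NOT FOUND, installing g++ as described above is the likely fix.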

yochrisbolton avatar Nov 26 '21 04:11 yochrisbolton

In my case installing "cudatoolkit-dev" solved the issue

CharanSG avatar Feb 23 '22 22:02 CharanSG

PyTorch ships separate builds for CPU-only and CUDA devices. I removed the CPU build and installed the CUDA build, following the guide here: https://pytorch.org/get-started/locally/

This is what I installed with pip: pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

I also hit an intermediate error, "cannot find libcurand.so.o something", which was solved by installing: apt-get install -y libopenblas-base

Make sure you are on Ubuntu 20.04 or higher before installing libopenblas-base.

And that's how my problem was solved!
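
To tell which kind of PyTorch build is installed, torch.version.cuda can be inspected: it is None on builds compiled without CUDA. The helper names below are hypothetical, and the sketch only reports what it finds.

```python
def torch_build_kind(cuda_version):
    """Classify a build from the value of torch.version.cuda."""
    return "cpu-only" if cuda_version is None else f"cuda {cuda_version}"


def describe_installed_torch():
    """Return a one-line summary of the installed torch build, if any."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    kind = torch_build_kind(torch.version.cuda)
    # is_available() also needs a working driver and GPU, not just a CUDA build.
    return f"torch {torch.__version__}: {kind}, GPU usable: {torch.cuda.is_available()}"
```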

aniruddhakal avatar Apr 13 '22 17:04 aniruddhakal

I think the problem is with the specific deepspeed version pinned in requirements (0.4.0). In my case, upgrading deepspeed solved it. You can upgrade with pip install -U deepspeed, and that should fix it.
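
To check whether an environment still has the pinned release, the installed version can be read from package metadata. A small sketch with hypothetical helper names; the numeric parsing is deliberately simplistic and ignores suffixes such as "+cu113".

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.4.0' into (0, 4, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_upgrade(package="deepspeed", pinned=(0, 4, 0)):
    """True if the package is installed at or below the pinned version."""
    current = installed_version(package)
    return current is not None and version_tuple(current) <= pinned
```

If needs_upgrade() returns True, running pip install -U deepspeed as suggested above is the next step.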

uahmad235 avatar Sep 16 '22 23:09 uahmad235

For me, I noticed it was exiting on a ['which', 'c++'] eval just before AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'

In my case, installing / updating g++ resolved the issue.

Thanks. This worked for me.

KelleyYin avatar Apr 06 '23 15:04 KelleyYin