finetune-gpt2xl
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
I tried to use your script (gpt2-xl), but I get an error: AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
pip list
Package Version
certifi 2021.5.30
charset-normalizer 2.0.4
click 8.0.1
configparser 5.0.2
datasets 1.8.0
deepspeed 0.4.0
dill 0.3.4
docker-pycreds 0.4.0
filelock 3.0.12
fsspec 2021.7.0
gitdb 4.0.7
GitPython 3.1.18
huggingface-hub 0.0.8
idna 3.2
importlib-metadata 4.7.0
joblib 1.0.1
multiprocess 0.70.12.2
ninja 1.10.2
numpy 1.21.2
packaging 21.0
pandas 1.3.2
pathtools 0.1.2
Pillow 8.3.1
pip 21.2.4
promise 2.3
protobuf 3.17.3
psutil 5.8.0
pyarrow 3.0.0
pyparsing 2.4.7
python-dateutil 2.8.2
pytz 2021.1
PyYAML 5.4.1
regex 2021.8.21
requests 2.26.0
sacremoses 0.0.45
sentry-sdk 1.3.1
setuptools 57.4.0
shortuuid 1.0.1
six 1.16.0
smmap 4.0.0
subprocess32 3.5.4
tensorboardX 1.8
tokenizers 0.10.3
torch 1.9.0
torchvision 0.10.0
tqdm 4.49.0
transformers 4.7.0
triton 1.0.0
typing-extensions 3.10.0.0
urllib3 1.26.6
wandb 0.12.0
wheel 0.37.0
xxhash 2.0.2
zipp 3.5.0
Without this block in ds_config.json, everything works, and the run takes 17 min:
"optimizer": { "type": "AdamW", "params": { "lr": "auto", "betas": "auto", "eps": "auto", "weight_decay": "auto" } }
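Removing the optimizer block from ds_config.json can be scripted rather than edited by hand. This is a minimal sketch, assuming the config is a plain JSON file with a top-level "optimizer" key; the helper name drop_optimizer and the output file name are hypothetical and not part of the repo. It likely helps because DeepSpeedCPUAdam is then never constructed, but that is an assumption, not something confirmed in this thread.

```python
import json
from pathlib import Path


def drop_optimizer(config_path, output_path):
    """Write a copy of a DeepSpeed JSON config without its "optimizer" block.

    Returns True if an optimizer block was present and removed.
    (Hypothetical helper for illustration only.)
    """
    config = json.loads(Path(config_path).read_text())
    removed = config.pop("optimizer", None)
    Path(output_path).write_text(json.dumps(config, indent=2))
    return removed is not None


if __name__ == "__main__":
    # Assumed file names; adjust to your checkout.
    if Path("ds_config.json").exists():
        changed = drop_optimizer("ds_config.json", "ds_config_no_optimizer.json")
        print("optimizer block removed" if changed else "no optimizer block found")
```

Writing to a separate file keeps the original config intact, so the two runs can be compared.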
Same problem
I ran into that too. Before AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam', this error also appeared: cannot make a dir in /tmp/torch_extensions/build for cpu_adam. So I changed DEFAULT_TORCH_EXTENSION_PATH in the file /anaconda3/envs/XXXXX/lib/python3.6/site-packages/deepspeed/ops/op_builder/builder.py from "/tmp/torch_extensions/" to a path where I have permission to create folders. Then it works.
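An alternative to editing builder.py inside site-packages may be the TORCH_EXTENSIONS_DIR environment variable, which PyTorch's C++ extension loader reads to decide where to build. The sketch below assumes your DeepSpeed version honors that variable as well (not verified for 0.4.0 in this thread); the helper name and the example directory are hypothetical.

```python
import os
from pathlib import Path


def use_writable_extensions_dir(candidate):
    """Point TORCH_EXTENSIONS_DIR at a directory the current user can write to.

    Must be called before deepspeed builds its ops (safest: before importing it).
    (Hypothetical helper; the directory is whatever you pass in.)
    """
    path = Path(candidate).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} is not writable")
    os.environ["TORCH_EXTENSIONS_DIR"] = str(path)
    return path


if __name__ == "__main__":
    # Example location under the home directory (an assumption, pick your own).
    print(use_writable_extensions_dir("~/.cache/torch_extensions"))
```

The same effect can be had by exporting the variable in the shell before launching training; doing it in Python just makes the writability check explicit.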
For me, I noticed it was exiting on a ['which', 'c++'] call just before AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'.
In my case, installing / updating g++ successfully resolved the issue.
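To check for a missing compiler before starting a long run, the same lookup can be done from Python. A minimal sketch: shutil.which is standard library, while the helper name and the list of candidate compiler names are assumptions for illustration.

```python
import shutil


def find_cxx_compiler(candidates=("c++", "g++", "clang++")):
    """Return the path of the first C++ compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    compiler = find_cxx_compiler()
    if compiler is None:
        print("No C++ compiler on PATH; building cpu_adam will likely fail.")
    else:
        print(f"Found C++ compiler: {compiler}")
```

If this reports no compiler, installing g++ as described above is the first thing to try.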
In my case, installing "cudatoolkit-dev" solved the issue.
torch offers different builds for CPU and CUDA devices. I removed the CPU build and installed the CUDA build as per the guidelines here: https://pytorch.org/get-started/locally/
This is what I installed with pip:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
Additionally there was this intermediate error:
"cannot find libcurand.so.o something"
which was solved by installing:
apt-get install -y libopenblas-base
But make sure you are on Ubuntu 20.04 or higher before installing libopenblas-base.
And that's how my problem was solved!
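To see whether the installed torch is a CPU-only or a CUDA build, something like the following can help. It is a sketch under the assumption that torch may or may not be installed; torch.version.cuda is None on CPU-only builds, and the helper name is hypothetical.

```python
import importlib.util


def torch_build_info():
    """Summarize the installed torch build, or return None if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch": torch.__version__,
        # None here means a CPU-only wheel was installed.
        "cuda_build": torch.version.cuda,
        # False can also mean a driver problem, not just a CPU-only wheel.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_build_info())
```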
I think the problem is with the specific deepspeed version pinned in requirements (i.e., 0.4.0). In my case, it was solved by upgrading deepspeed. You can upgrade it with this command: pip install -U deepspeed
That should fix it.
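To confirm which deepspeed version is actually installed in the active environment, a quick check like this can be run. It is a sketch: importlib.metadata needs Python 3.8 or newer, the helper names are hypothetical, and the 0.4.0 baseline is simply the version listed in this thread, not a confirmed cutoff for the bug.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version):
    """Turn '0.5.10+cu113' into (0, 5, 10); missing parts count as 0."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


if __name__ == "__main__":
    version = installed_version("deepspeed")
    if version is None:
        print("deepspeed is not installed")
    elif release_tuple(version) <= (0, 4, 0):
        print(f"deepspeed {version} found; consider: pip install -U deepspeed")
    else:
        print(f"deepspeed {version} found")
```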
For me, I noticed it was exiting on a ['which', 'c++'] call just before AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'.
In my case, installing / updating g++ successfully resolved the issue.
Thanks. This is effective for me.