TypeError: getattr(): attribute name must be string
Describe the bug
When I was trying to initialize the model with DeepSpeed, the following error appeared, and it appears in every situation where I try to use DeepSpeed on my current machine. It reports that the attribute name of the optimizer must be a string, but I did not specify an optimizer at all, so I expected the default optimizer to be used. It is totally unacceptable and surprising to me that such a standard piece of code, taken straight from a tutorial, fails to run.
To Reproduce
Steps to reproduce the behavior:
- Use any model and any training data, specify the deepspeed_config, and run the following line (a minimal self-contained sketch of such a script follows the traceback below):
  model_engine, optimizer, trainloader, __ = deepspeed.initialize(args=args, model=net, model_parameters=parameters, training_data=trainset)
- See the following traceback:
Traceback (most recent call last):
File "//Transformers/test/111.py", line 86, in
Expected behavior
The DeepSpeed engine loads successfully.
ds_report output
DeepSpeed C++/CUDA extension op report
NOTE: Ops not installed will be just-in-time (JIT) compiled at runtime if needed. Op compatibility means that your system meet the required dependencies to JIT install the op.
JIT compiled ops requires ninja
ninja .................. [OKAY]
op name ................ installed .. compatible
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
[WARNING] please install triton==1.0.0 if you want to use sparse attention
sparse_attn ............ [NO] ....... [NO]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
DeepSpeed general environment info:
torch install path ............... ['/home/hhchen/miniconda3/envs/Torch3.8/lib/python3.8/site-packages/torch']
torch version .................... 1.12.1
torch cuda version ............... 11.3
torch hip version ................ None
nvcc version ..................... 11.1
deepspeed install path ........... ['/home/hhchen/miniconda3/envs/Torch3.8/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.6.5, unknown, unknown
deepspeed wheel compiled w. ...... torch 1.12, cuda 11.3
System info (please complete the following information):
- OS: Linux version 5.11.0-49-generic
- GPU count and types: 8x RTX 3090
- Python version: 3.8
Launcher context: deepspeed
Hey @henrydylan - have you tried just specifying an optimizer via e.g.
..."optimizer": {
"type": "Adam",
"params": {
"lr": 0.00015
}
},...
Maybe that could help.
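For completeness, the same optimizer settings can also be supplied programmatically as a dict, which avoids editing the JSON file. This is only a sketch with illustrative values and a placeholder model/dataset; recent DeepSpeed releases accept the dict via the config keyword of deepspeed.initialize, while older ones use config_params:

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset
import deepspeed

net = nn.Linear(10, 2)  # stand-in model
trainset = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))  # stand-in data

# Illustrative config: the optimizer block mirrors the JSON snippet above.
ds_config = {
    "train_batch_size": 16,
    "optimizer": {
        "type": "Adam",
        "params": {"lr": 0.00015},
    },
}

# `config=` on recent DeepSpeed releases; older ones take `config_params=`.
model_engine, optimizer, trainloader, _ = deepspeed.initialize(
    model=net,
    model_parameters=net.parameters(),
    training_data=trainset,
    config=ds_config,
)
```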
Also, you mention that "default optimizer must be used", but I am not sure if that is the case. The documentation does not seem to mention a default optimizer.
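If defining the optimizer in the config is not an option, another route is to construct a torch optimizer yourself and pass it to deepspeed.initialize via its optimizer argument, so DeepSpeed does not have to resolve one from the config at all. A minimal sketch with a placeholder model and illustrative values:

```python
import torch
import torch.nn as nn
import deepspeed

net = nn.Linear(10, 2)  # stand-in model

# Build the optimizer client-side so DeepSpeed never has to look one up
# from the config (lr and batch size are illustrative).
client_optimizer = torch.optim.Adam(net.parameters(), lr=1.5e-4)

model_engine, optimizer, _, _ = deepspeed.initialize(
    model=net,
    optimizer=client_optimizer,
    config={"train_batch_size": 16},
)
```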
Hi @henrydylan - this issue is fairly stale and I suspect there are enough changes that the issue can't be debugged in its original form. If you are still having issues, or if others find this, please open a new issue and link this one.