[BUG] return getattr(args, f"{model_type[step_num]}_model")

Open koalawangyang opened this issue 1 year ago • 3 comments

Describe the bug: When running the train command from the examples, the following error shows up.

[root@iZuf69ogcccxrcfuonlyzfZ DeepSpeed-Chat]# python train.py --actor-model facebook/opt-1.3b --reward-model facebook/opt-350m --deployment-type single_gpu
  File "train.py", line 110
    return getattr(args, f"{model_type[step_num]}_model")
                                                       ^
SyntaxError: invalid syntax

ds_report output

[root@iZuf69ogcccxrcfuonlyzfZ DeepSpeed-Chat]# python3 -m deepspeed.env_report
--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-devel package with yum
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
 [WARNING]  sparse_attn requires a torch version >= 1.5 but detected 2.0
 [WARNING]  using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
/usr/local/python3/lib/python3.10/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['/usr/local/python3/lib/python3.10/site-packages/torch']
torch version .................... 2.0.0+cu117
deepspeed install path ........... ['/usr/local/python3/lib/python3.10/site-packages/deepspeed']
deepspeed info ................... 0.9.0, unknown, unknown
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 12.1
deepspeed wheel compiled w. ...... torch 2.0, cuda 11.7

System info (please complete the following information):

  • OS: [CentOS 7.9]
  • GPU count and types [single Nvidia A100 80G]
  • Python version: 3.10

Additional context: I ran into many errors while following the official instructions to install DeepSpeed Chat. This is the biggest one, and I couldn't solve it myself. Please help.

koalawangyang avatar Apr 14 '23 09:04 koalawangyang

@koalawangyang could you please add a print out before this line and output the contents of args: print(args)? I just tried the same command and I get no error. Also, please share the other errors you are seeing. We would like to continue to improve the scripts we have :)

mrwyattii avatar Apr 14 '23 16:04 mrwyattii

I also encountered the same problem.

haoyu-lab avatar Apr 15 '23 15:04 haoyu-lab

I just ran the example: python train.py --actor-model facebook/opt-13b --reward-model facebook/opt-350m --deployment-type single_node

but I got the same issue:

python train.py --actor-model facebook/opt-13b --reward-model facebook/opt-350m --deployment-type single_node
  File "train.py", line 110
    return getattr(args, f"{model_type[step_num]}_model")
                                                       ^
SyntaxError: invalid syntax

jokerjoey avatar Apr 23 '23 02:04 jokerjoey

I've solved this problem.

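It's caused by the 'python' command using the system default Python 2.x to run the train.py script. Python 2 does not support f-strings (they require Python 3.6 or newer), so it reports a SyntaxError on that line.

Changing the command to 'python3 train.py xxxxx' makes it work.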
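
For anyone hitting the same error, here is a minimal sketch of a version guard that turns the confusing SyntaxError into a readable message. It is not part of train.py; the names `supports_fstrings`, `require_fstring_support`, and `MIN_VERSION` are made up for illustration. The guard deliberately avoids f-strings so that Python 2 can still parse it, and it would need to live in its own file that runs before train.py is loaded, because a SyntaxError is raised when the whole file is parsed, before any line in it executes.

```python
import sys

# Hypothetical helper, not part of train.py: f-strings need Python 3.6+.
MIN_VERSION = (3, 6)


def supports_fstrings(version_info=None):
    """Return True if the given (or running) interpreter can parse f-strings."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


def require_fstring_support():
    """Exit with a readable message instead of a SyntaxError on old interpreters."""
    if not supports_fstrings():
        # str.format keeps this file parseable by Python 2 as well.
        sys.exit(
            "Python {0}.{1} or newer is required, found {2}.{3}; try python3.".format(
                MIN_VERSION[0], MIN_VERSION[1],
                sys.version_info[0], sys.version_info[1]
            )
        )


if __name__ == "__main__":
    require_fstring_support()
    print("Interpreter OK: " + sys.executable)
```

Running this file with both `python` and `python3` shows which interpreter each command resolves to on a given machine.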

koalawangyang avatar Apr 24 '23 03:04 koalawangyang

Closing this issue.

koalawangyang avatar Apr 24 '23 03:04 koalawangyang