
ModuleAttributeError: 'DeepSpeedEngine' object has no attribute 'is_pipe_parallel'

Open sameeravithana opened this issue 3 years ago • 0 comments

Describe the bug

Unable to run evaluate.py on a gpt-neox model trained with pp=0, mp=1.

To Reproduce

Train a 13B model with ZeRO stage 2, pp=0, mp=1. Save a checkpoint. Run the evaluate.py script against the saved checkpoint.

Screenshots

Traceback (most recent call last):
  File "../../gpt-neox/evaluate.py", line 62, in <module>
    main()
  File "../../gpt-neox/evaluate.py", line 39, in main
    results = run_eval_harness(model, forward_step, neox_args, eval_tasks=neox_args.eval_tasks,
  File "../gpt-neox/eval_tasks/eval_adapter.py", line 222, in run_eval_harness
    adapter = EvalHarnessAdapter(model, forward_step_fn, neox_args, batch_size)
  File "../gpt-neox/eval_tasks/eval_adapter.py", line 58, in __init__
    self.is_pipe_parallel = self.model.is_pipe_parallel
  File "../lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'DeepSpeedEngine' object has no attribute 'is_pipe_parallel'

Killing subprocess 6522
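
The traceback shows the adapter reading `is_pipe_parallel` straight off the engine, which fails when the engine does not define that attribute (here, a model trained with pp=0). A possible workaround, offered only as a minimal sketch, is to fall back to `False` when the attribute is missing. The two classes and the helper `resolve_is_pipe_parallel` below are hypothetical stand-ins for illustration, not gpt-neox or DeepSpeed code, and whether the rest of the adapter then runs correctly without pipeline parallelism has not been verified here.

```python
class FakeDeepSpeedEngine:
    """Hypothetical stand-in for an engine with no is_pipe_parallel attribute."""


class FakePipelineEngine:
    """Hypothetical stand-in for an engine that exposes the flag."""

    def __init__(self, is_pipe_parallel=True):
        self.is_pipe_parallel = is_pipe_parallel


def resolve_is_pipe_parallel(model):
    # getattr with a default swallows the AttributeError that a direct
    # attribute access would raise when the attribute is missing.
    # Assumption: a missing attribute means "not pipeline parallel".
    return bool(getattr(model, "is_pipe_parallel", False))


if __name__ == "__main__":
    print(resolve_is_pipe_parallel(FakeDeepSpeedEngine()))
    print(resolve_is_pipe_parallel(FakePipelineEngine()))
```

In the adapter's `__init__`, the equivalent change would be replacing the direct access on line 58 with a `getattr` call that supplies a default.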

sameeravithana, Jul 07 '22 17:07