gpt-neox
ModuleAttributeError: 'DeepSpeedEngine' object has no attribute 'is_pipe_parallel'
Describe the bug
Unable to run evaluate.py with a gpt-neox model trained with pp=0, mp=1.
To Reproduce
Train a 13B model with ZeRO stage 2, pp=0, mp=1. Save a checkpoint. Run the evaluate.py script against the saved checkpoint.
Screenshots
Traceback (most recent call last):
  File "../../gpt-neox/evaluate.py", line 62, in <module>
    main()
  File "../../gpt-neox/evaluate.py", line 39, in main
    results = run_eval_harness(model, forward_step, neox_args, eval_tasks=neox_args.eval_tasks,
  File "../gpt-neox/eval_tasks/eval_adapter.py", line 222, in run_eval_harness
    adapter = EvalHarnessAdapter(model, forward_step_fn, neox_args, batch_size)
  File "../gpt-neox/eval_tasks/eval_adapter.py", line 58, in __init__
    self.is_pipe_parallel = self.model.is_pipe_parallel
  File "../lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'DeepSpeedEngine' object has no attribute 'is_pipe_parallel'
Killing subprocess 6522
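Possible workaround
The traceback shows eval_adapter.py reading is_pipe_parallel directly from the model, which here is a plain DeepSpeedEngine that does not define it. Below is a minimal sketch of a defensive lookup, not a confirmed fix. It assumes a missing attribute can safely be treated as "not pipeline parallel", which may not hold for the rest of the adapter. FakeEngine, FakePipeEngine, and model_is_pipe_parallel are hypothetical names used only for illustration; they are not part of gpt-neox or DeepSpeed.

```python
class FakeEngine:
    """Hypothetical stand-in for an engine built with pp=0.

    Like the object in the traceback, it raises an AttributeError
    when is_pipe_parallel is looked up.
    """

    def __getattr__(self, name):
        # Mirrors the error message format seen in the traceback.
        raise AttributeError(
            "'{}' object has no attribute '{}'".format(type(self).__name__, name)
        )


class FakePipeEngine:
    """Hypothetical stand-in for an engine that does expose the flag."""

    is_pipe_parallel = True


def model_is_pipe_parallel(model):
    """Return the model's is_pipe_parallel flag, or False if it is absent.

    Assumption: an engine without the attribute is not pipeline parallel.
    """
    return bool(getattr(model, "is_pipe_parallel", False))


if __name__ == "__main__":
    print(model_is_pipe_parallel(FakeEngine()))
    print(model_is_pipe_parallel(FakePipeEngine()))
```

If this assumption is acceptable, line 58 of eval_adapter.py could use the same getattr fallback in place of the direct attribute access. Later code paths that branch on the flag would still need to handle the non-pipeline case.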