[BUG] Load fused_adam_cuda failed

Open Fazziekey opened this issue 2 years ago • 2 comments

Describe the bug I installed DeepSpeed with DS_BUILD_OPS=1 pip install deepspeed to compile the extra kernels, but I get an error when I use the ZeRO optimizer. The failing line is fused_adam_cuda = FusedAdamBuilder().load(), which raises TypeError: 'NoneType' object is not callable

image

Expected behavior A clear and concise description of what you expected to happen.

ds_report output I ran ds_report and got this error: image

System info (please complete the following information):

  • OS: CentOS
  • GPU count and types: 1 machine with 8x A100s
  • Python version: 3.9.12
  • CUDA: 11.3
  • torch: 1.13

Launcher context deepspeed --num_gpus=2 train.py

Fazziekey avatar Feb 22 '23 07:02 Fazziekey

This may be related to the CUDA version. Please check it with nvcc --version, or upgrade CUDA to 11.8.
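A minimal sketch of that check, assuming the problem is a mismatch between the CUDA toolkit found on PATH and the CUDA version PyTorch was built with. The helper names (parse_nvcc_release, same_cuda_major, system_nvcc_release) are hypothetical, and comparing only the major version is an assumption made for illustration, not DeepSpeed's own compatibility check.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_cuda_major(torch_cuda, nvcc_cuda):
    """Return True when both versions are known and share a major version.

    Assumption: a major-version match is treated as "good enough" here.
    """
    if not torch_cuda or not nvcc_cuda:
        return False
    return torch_cuda.split(".")[0] == nvcc_cuda.split(".")[0]


def system_nvcc_release():
    """Run `nvcc --version` if nvcc is on PATH and return its release, else None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to read the build-time CUDA version
        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = None

    nvcc_cuda = system_nvcc_release()
    print("torch built with CUDA:", torch_cuda)
    print("nvcc on PATH reports:", nvcc_cuda)
    print("major versions match:", same_cuda_major(torch_cuda, nvcc_cuda))
```

If the two versions differ, or nvcc is not found at all, that is worth fixing before rebuilding the ops with DS_BUILD_OPS=1.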

GalaxyHe2023 avatar May 25 '23 03:05 GalaxyHe2023

Hi @Fazziekey - are you still having this issue? And if so, can you try and repro with the latest DeepSpeed?

loadams avatar Jun 12 '23 21:06 loadams

Closing this issue for now since there has been no reply. If anyone is still hitting this, feel free to re-open.

loadams avatar Jun 26 '23 17:06 loadams