Single-machine multi-GPU training of rt-detrv2-r101: loss.backward() raises ValueError: (InvalidArgument) Required tensor shall not be nullptr, but received nullptr.
Search before asking
Bug Component
Training
Describe the Bug
When I train rt-detr with the following command:
python -m paddle.distributed.launch --gpus 0,1,2 tools/train.py -c configs/rtdetrv2/rtdetrv2_r101vd_6x_coco.yml --fleet --eval
the following error occurs:
Traceback (most recent call last):
File "/home/zqy/zqy/Codes/PaddleDetection/tools/train.py", line 209, in <module>
main()
File "/home/zqy/zqy/Codes/PaddleDetection/tools/train.py", line 205, in main
run(FLAGS, cfg)
File "/home/zqy/zqy/Codes/PaddleDetection/tools/train.py", line 158, in run
trainer.train(FLAGS.eval)
File "/home/zqy/zqy/Codes/PaddleDetection/ppdet/engine/trainer.py", line 614, in train
loss.backward()
File "/usr/local/lib/python3.10/dist-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/wrapped_decorator.py", line 26, in __impl__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/framework.py", line 593, in __impl__
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/paddle/base/dygraph/tensor_patch_methods.py", line 342, in backward
core.eager.run_backward([self], grad_tensor, retain_graph)
ValueError: (InvalidArgument) Required tensor shall not be nullptr, but received nullptr.
[Hint: tensor should not be null.] (at ../paddle/phi/core/device_context.cc:142)
When I train with a single GPU, the error does not occur.
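For reference, here is a minimal sketch (not part of the original report; the file name check_fleet_backward.py is hypothetical) that exercises the same collective --fleet path outside PaddleDetection. If loss.backward() also fails here under paddlepaddle-gpu 2.6.x + CUDA 11.7, the problem is likely in Paddle's distributed backward itself rather than in the rtdetrv2 config:

```python
# Hypothetical helper, not from the report: checks whether a plain multi-GPU
# backward pass works in this environment. Launch it the same way as the
# failing run:
#   python -m paddle.distributed.launch --gpus 0,1,2 check_fleet_backward.py
import paddle
from paddle.distributed import fleet

fleet.init(is_collective=True)  # same collective mode that --fleet enables

model = paddle.nn.Linear(16, 4)
opt = paddle.optimizer.SGD(learning_rate=0.01, parameters=model.parameters())
model = fleet.distributed_model(model)  # wrap model for multi-card gradient sync
opt = fleet.distributed_optimizer(opt)

x = paddle.randn([8, 16])
loss = model(x).mean()
loss.backward()                         # the call that raises the ValueError in the report
opt.step()
opt.clear_grad()
print("backward OK on rank", paddle.distributed.get_rank())
```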
Environment
- OS: Linux
- PaddlePaddle: paddlepaddle-gpu 2.6.1.post117 and paddlepaddle-gpu 2.6.0.post117
- PaddleDetection: develop/2.7
- Python: 3.10
- CUDA: 11.7
Bug description confirmation
- [X] I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.
Are you willing to submit a PR?
- [ ] I'd like to help by submitting a PR!