DINO training error when using InternImage

Open winpih opened this issue 2 years ago • 2 comments

The following warning occurs during training. Please check.

/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py:173: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [1280, 1, 3, 3], strides() = [9, 1, 3, 1]
bucket_view.sizes() = [1280, 1, 3, 3], strides() = [9, 9, 3, 1]
(Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:326.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass

winpih avatar Apr 10 '23 09:04 winpih
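
A note on the two stride lists in the warning: for a weight of shape [1280, 1, 3, 3], the strides [9, 9, 3, 1] are what a default row-major layout gives, and [9, 1, 3, 1] are what a channels-last (NHWC) layout would give. Because the channel dimension has size 1, both address the same elements in the same order. The plain-Python sketch below only checks that arithmetic. It assumes the gradient was produced in channels-last layout, which the thread does not confirm, and it says nothing about which part of InternImage or detrex triggers the warning. The function names are illustrative and are not part of PyTorch.

```python
from itertools import product


def contiguous_strides(shape):
    """Row-major strides (in elements) for a tensor of the given shape."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return list(reversed(strides))


def channels_last_strides(shape):
    """Strides for a 4-D (N, C, H, W) tensor whose memory is ordered N, H, W, C."""
    n, c, h, w = shape
    return [h * w * c, 1, w * c, c]


def offsets(shape, strides):
    """Memory offset of every element, visited in logical index order."""
    return [
        sum(i * s for i, s in zip(idx, strides))
        for idx in product(*(range(d) for d in shape))
    ]


shape = [1280, 1, 3, 3]                  # sizes reported in the warning
bucket_view = contiguous_strides(shape)  # row-major strides
grad = channels_last_strides(shape)      # channels-last strides

# True when both stride sets map every index to the same memory offset.
same_layout = offsets(shape, bucket_view) == offsets(shape, grad)
print(bucket_view, grad, same_layout)
```

With more than one channel the two layouts would no longer coincide, so the comparison is specific to shapes whose channel dimension is 1, such as this depthwise convolution weight.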

OK, we will look into this bug soon.

rentainhe avatar Apr 10 '23 17:04 rentainhe

Thank you sir!!

winpih avatar Apr 11 '23 00:04 winpih