mmrotate
[Bug] Gradient norm logged during training appears to be wrong
Prerequisite
- [X] I have searched Issues and Discussions but cannot get the expected help.
- [X] I have read the FAQ documentation but cannot get the expected help.
- [X] The bug has not been fixed in the latest version (master) or latest version (1.x).
Task
I'm using the official example scripts/configs for the officially supported tasks/models/datasets.
Branch
master branch https://github.com/open-mmlab/mmrotate
Environment
Not very relevant...
Reproduces the problem - code sample
Reproduces the problem - command or script
python tools/train.py someconfig.py
Reproduces the problem - error message
No error. I am convinced that the gradient norm printed to the console during training must be wrong, for the two reasons below (a cross-check sketch follows the list):
- Sometimes the gradient norm is reported as infinity. An infinite norm would imply infinitely large step sizes and therefore failing training, yet I have seen an infinite gradient norm and still gotten successful training.
- The gradient norm sometimes jumps around without noticeably influencing the loss. In a recent training run it jumped from a normal value of about 6 to roughly 600 within a single step, yet the loss was not notably affected. This seems odd: a large gradient norm should cause a large corrective step to the weights and therefore a rather large change in loss as well.
Additional information
Based on the points above, I suspect the gradient norm is computed incorrectly. If it is in fact computed correctly, this behaviour is worth documenting, since it is likely not what one would expect.