
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`.

Open JiawuTian opened this issue 5 months ago • 4 comments

Hello. First, I want to express my sincere appreciation for your effort in rewriting D-FINE from scratch. Your repository helped me a lot in training D-FINE on my custom dataset.

However, when I used your train.py, I got a warning:

/home/tjw/.conda/envs/cd-fine/lib/python3.11/site-packages/torch/optim/lr_scheduler.py:227: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn(

I tried my best to resolve this warning but, frustratingly, failed.

I added print() calls to debug and located the trigger: the warning is reproduced when self.scheduler.step() is called at line 344 of train.py, on the first batch of the first epoch.

The code containing the print() calls (note that optimizer_step() is a nested function of train()) is:

        def optimizer_step(step_scheduler: bool):
            """
            Clip grads, optimizer step, scheduler step, zero grad, EMA model update
            """
            nonlocal ema_iter
            if self.amp_enabled:
                if self.clip_max_norm:
                    print("test1")
                    self.scaler.unscale_(self.optimizer)
                    print("test2")
                    torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_max_norm)
                print("test3")
                self.scaler.step(self.optimizer)
                print("test4")
                self.scaler.update()
                print("test5")
            else:
                if self.clip_max_norm:
                    torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_max_norm)
                self.optimizer.step()

            if step_scheduler:
                self.scheduler.step()
                print("test6")
            
            self.optimizer.zero_grad()

The following screenshot shows the warning: [Image]

The order of self.optimizer.step(), self.scheduler.step(), and self.optimizer.zero_grad() is correct.

Why does this warning appear, and how can I solve it?

JiawuTian avatar Jul 09 '25 02:07 JiawuTian

Hey, thanks for the kind words. The weird thing is that I have seen that warning only once, with a different PyTorch version than the one I usually use. I did try to debug it, and the order looks correct to me, so maybe it is a torch issue. Can you share which version you have? I'll look into it again and maybe open a ticket in the PyTorch repo.

ArgoHA avatar Jul 09 '25 05:07 ArgoHA

> Hey, thanks for the kind words. The weird thing is that I have seen that warning only once, with a different PyTorch version than the one I usually use. I did try to debug it, and the order looks correct to me, so maybe it is a torch issue. Can you share which version you have? I'll look into it again and maybe open a ticket in the PyTorch repo.

Thank you for your quick reply.

The PyTorch version is 2.6.0+cu124, obtained via:

python -c "import torch; print(torch.__version__)"

The warning appears only once per run of train.py, and only on the first batch.

Other GPU-related information that may be useful (single NVIDIA GeForce RTX 4090): NVIDIA-SMI 550.144.03, Driver Version 550.144.03, CUDA Version 12.4.

JiawuTian avatar Jul 09 '25 06:07 JiawuTian

Ok, I think this is the solution: https://discuss.pytorch.org/t/optimizer-step-before-lr-scheduler-step-error-using-gradscaler/92930/6

So this shouldn't really be an issue, but I will push an update to get rid of that warning.
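
For reference, here is a minimal standalone sketch of the workaround described in that thread (not the exact patch for this repo; the tiny model, optimizer, and scheduler below are placeholders, and it assumes a CUDA device). The idea is that scaler.step() silently skips the real optimizer.step() when it finds inf/NaN gradients, which is expected on the first iterations while the loss scale calibrates, so the scheduler should only step when the scale did not shrink:

    import torch

    # Sketch of the workaround: a skipped optimizer step makes GradScaler
    # shrink its scale, so comparing the scale before/after update() tells
    # us whether optimizer.step() actually ran.
    model = torch.nn.Linear(4, 2).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
    scaler = torch.amp.GradScaler("cuda")

    for _ in range(3):
        optimizer.zero_grad()
        with torch.autocast("cuda"):
            loss = model(torch.randn(8, 4, device="cuda")).sum()
        scaler.scale(loss).backward()

        scale_before = scaler.get_scale()
        scaler.step(optimizer)   # may skip optimizer.step() on inf/NaN grads
        scaler.update()          # shrinks the scale when the step was skipped
        if scaler.get_scale() >= scale_before:
            scheduler.step()     # only advance the LR schedule after a real step

With that guard the scheduler only advances together with real optimizer steps, so the warning goes away.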

ArgoHA avatar Jul 09 '25 07:07 ArgoHA

> Ok, I think this is the solution: https://discuss.pytorch.org/t/optimizer-step-before-lr-scheduler-step-error-using-gradscaler/92930/6
>
> So this shouldn't really be an issue, but I will push an update to get rid of that warning.

@ArgoHA Many thanks for your time digging into a solution for that warning. I would be very happy if you could remind me once the repository is updated.

JiawuTian avatar Jul 09 '25 08:07 JiawuTian