
[Usage] None of the inputs have requires_grad=True. Gradients will be None

Open hellangleZ opened this issue 9 months ago • 12 comments

Describe the issue

Issue:

The log says that gradients will be None.

Command:

Just using the pretrain script.

Log:

/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.

Screenshots: (screenshot attached)

hellangleZ · Apr 30 '24 09:04

/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
/data22/llava/lib/python3.10/site-packages/torch/utils/checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn(
{'loss': 1.8577, 'learning_rate': 3.205128205128205e-07, 'epoch': 0.0}
{'loss': 1.7297, 'learning_rate': 6.41025641025641e-07, 'epoch': 0.0}
{'loss': 1.866, 'learning_rate': 9.615384615384617e-07, 'epoch': 0.0}
{'loss': 2.0846, 'learning_rate': 1.282051282051282e-06, 'epoch': 0.0}

Could anyone help take a look at this?

Thanks

hellangleZ · May 01 '24 03:05

Have you solved it? I encountered the same problem.

LijunZhang01 · May 13 '24 12:05

I have the same problem. Has it been solved?

PzWHU · May 16 '24 03:05

I have the same problem

xiaxiangzhou · May 18 '24 20:05

I have the same problem

y-rok · May 30 '24 11:05

Has this been solved?

PangziZhang523 · Jun 19 '24 16:06

Has this been solved?

@PangziZhang523 Hello, have you solved it? I have the same issue, but I can see the loss is still decreasing normally.

dacian7 · Jul 01 '24 09:07

+1

SuperBruceJia · Jul 06 '24 05:07

Any solutions?

SuperBruceJia · Jul 06 '24 05:07

Any solutions?

@SuperBruceJia It is not a problem; just ignore it. I completed the training and the model is fine.

dacian7 · Jul 06 '24 05:07

@dacian7 Thank you very much for your quick reply!

However, I have encountered a further issue after this one: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn


SuperBruceJia · Jul 06 '24 05:07

The warning comes from the ViT component, which is already frozen, so you can ignore it. It’s related to gradient checkpointing, which is used to save memory. You’ll notice that the LLM has already called enable_input_require_grads(). After ViT, the MM-projector also needs gradients, so the backward pass for these two components is normal.
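To see why the warning is harmless, here is a minimal, self-contained sketch (toy modules, not LLaVA code): checkpointing a frozen stand-in "ViT" whose inputs do not require grad triggers the same UserWarning, yet the trainable "projector" downstream still receives gradients.

```python
# Toy reproduction of the warning: a frozen module under gradient checkpointing.
import torch
from torch.utils.checkpoint import checkpoint

frozen_vit = torch.nn.Linear(8, 8)         # stand-in for the frozen ViT
for p in frozen_vit.parameters():
    p.requires_grad_(False)
projector = torch.nn.Linear(8, 8)          # stand-in for the trainable MM-projector

x = torch.randn(2, 8)                      # input tensor; no grad required
feats = checkpoint(frozen_vit, x, use_reentrant=True)  # emits the UserWarning
loss = projector(feats).sum()
loss.backward()

print(frozen_vit.weight.grad)              # None: frozen, exactly as intended
print(projector.weight.grad is not None)   # True: the projector still trains
```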

If you want to remove this warning (though it might not be necessary), you can call vit.enable_input_require_grads() and remove torch.no_grad() in CLIP.
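For concreteness, a sketch of that change, assuming the Hugging Face CLIPVisionModel API (in LLaVA the underlying model lives at CLIPVisionTower.vision_tower, and the @torch.no_grad() decorator on CLIPVisionTower.forward would also need to be removed):

```python
# Hedged sketch: keep the ViT frozen but let gradients flow through it,
# assuming the transformers CLIPVisionModel API.
import torch
from transformers import CLIPVisionModel

vit = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
vit.train()
for p in vit.parameters():
    p.requires_grad_(False)               # parameters stay frozen

# enable_input_require_grads() registers a forward hook on the input
# embeddings so the checkpointed segments see inputs that require grad,
# which silences the "None of the inputs have requires_grad=True" warning.
vit.enable_input_require_grads()
vit.gradient_checkpointing_enable()

pixels = torch.randn(1, 3, 224, 224)
feats = vit(pixel_values=pixels).last_hidden_state
print(feats.requires_grad)                # True: activations carry a graph
```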

Note that, in my understanding, if you plan to modify anything related to the ViT, those changes are essential.
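As an aside, the companion use_reentrant deprecation warning in the logs above can be silenced by passing the flag explicitly. A sketch assuming a recent transformers version (gradient_checkpointing_kwargs was added in 4.35; LLaVA's own training scripts may wire checkpointing differently):

```python
from transformers import TrainingArguments

# Pass use_reentrant explicitly so torch.utils.checkpoint stops warning.
args = TrainingArguments(
    output_dir="./checkpoints",           # hypothetical path
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```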

jiazhen-code · Sep 15 '24 06:09