Moore-AnimateAnyone

RuntimeError: Trying to backward through the graph a second time

Open TZYSJTU opened this issue 1 year ago • 2 comments

I think training all networks together will be better, so I set

    reference_unet.requires_grad_(True)
    denoising_unet.requires_grad_(True)
    pose_guider.requires_grad_(True)

However, when I set gradient_checkpointing: True in the training config YAML, it raised RuntimeError: Trying to backward through the graph a second time. And I found that if I set

    reference_unet.requires_grad_(False)
    denoising_unet.requires_grad_(True)
    pose_guider.requires_grad_(True)

it works fine. So what's wrong with the reference_unet? Could you please help?
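For context, this error message is generic PyTorch autograd behavior rather than anything specific to this repo: after the first backward, a graph's saved tensors are freed, so any second backward that walks the same subgraph fails. A minimal sketch (made-up tensors, not the repo's code) reproduces it:

```python
import torch

# Minimal sketch (not the repo's code): PyTorch frees a graph's saved
# tensors after the first backward, so a second backward through the
# same subgraph raises the error reported above.
x = torch.ones(3, requires_grad=True)
shared = x * x                # subgraph; its saved tensors are freed on backward
shared.sum().backward()       # first backward succeeds and frees the graph
try:
    shared.sum().backward()   # walks the already-freed subgraph again
except RuntimeError as e:
    print("reproduced:", "second time" in str(e))
```

This suggests that when reference_unet is trainable, some part of its graph (e.g. the cached reference features) is being backwarded through more than once per optimizer step.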

TZYSJTU avatar May 16 '24 08:05 TZYSJTU

I also encountered this problem. When training the referencenet with gradient_accumulation_step turned on, it reports RuntimeError: Trying to backward through the graph a second time. If I don't train the referencenet, everything works normally. Have you found a solution?

zhuochen02 avatar Oct 09 '24 05:10 zhuochen02

I have a similar problem. Have you found a solution?

Removing denoising_unet.enable_gradient_checkpointing() works, but then the GPUs go OOM.
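If the root cause is that cached reference features are backwarded through on more than one accumulation micro-step, one possible workaround is to recompute the reference forward inside every micro-step so each backward gets a fresh graph. A sketch with stand-in modules (reference_net / denoising_net here are tiny Linear layers, not the repo's UNets):

```python
import torch
import torch.nn as nn

# Hedged sketch (stand-in modules, not the repo's training loop):
# recomputing the reference forward on every accumulation micro-step
# means each backward frees only its own graph, so no graph is walked twice.
reference_net = nn.Linear(4, 4)
denoising_net = nn.Linear(4, 1)
opt = torch.optim.SGD(
    list(reference_net.parameters()) + list(denoising_net.parameters()), lr=1e-3
)

accum_steps = 2
for _ in range(accum_steps):
    batch = torch.randn(8, 4)
    ref_feats = reference_net(batch)                   # recomputed each micro-step
    loss = denoising_net(ref_feats).mean() / accum_steps
    loss.backward()                                    # backward through a fresh graph
opt.step()
opt.zero_grad()
```

The trade-off is extra compute per step. The alternative, loss.backward(retain_graph=True), keeps the freed graph alive instead, but the added memory may defeat the purpose of gradient checkpointing in the first place.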

antoinedelplace avatar Nov 12 '24 14:11 antoinedelplace