animate-anything
how to control motion magnitude
In train.py, I noticed that for motion control the motion magnitude is computed both in RGB space (batch["motion"]) and in latent space (latent_motion = calculate_latent_motion_score(latents)), but only the latent-space value is used in the UNet prediction. Could you explain why the RGB-space value is not used?
It is for efficiency. If we used the RGB-space motion representation, we would have to decode predict_x0 back into RGB space in order to compute the motion loss in the line motion_loss = F.mse_loss(latent_motion, calculate_latent_motion_score(predict_x0)).
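
To make the reply above concrete, here is a minimal sketch of computing the motion loss entirely in latent space, so that predict_x0 never has to be decoded. The function latent_motion_score is a hypothetical stand-in written for illustration (mean absolute frame-to-frame difference); it is not the repository's calculate_latent_motion_score, and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F


def latent_motion_score(latents: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a latent-space motion score.

    Assumes latents has shape (batch, channels, frames, height, width)
    and returns one scalar score per sample.
    """
    # Absolute difference between consecutive frames along the frame axis.
    diff = (latents[:, :, 1:] - latents[:, :, :-1]).abs()
    # Average over channels, frame pairs, and spatial positions.
    return diff.mean(dim=(1, 2, 3, 4))


torch.manual_seed(0)

# Assumed shapes: 2 clips, 4 latent channels, 8 frames, 16x16 latents.
latents = torch.randn(2, 4, 8, 16, 16)

# Placeholder for the model's estimate of the clean latents.
predict_x0 = (latents + 0.1 * torch.randn_like(latents)).requires_grad_(True)

# Target score from the ground-truth latents, compared with the score
# of the prediction. Both stay in latent space, so no decoder call is needed.
latent_motion = latent_motion_score(latents)
motion_loss = F.mse_loss(latent_motion, latent_motion_score(predict_x0))
motion_loss.backward()

# A clip with identical frames has zero motion under this score.
static_score = latent_motion_score(torch.ones(1, 4, 8, 16, 16))
```

With an RGB-space score, the second argument of F.mse_loss would instead require decoding predict_x0 through the VAE at every training step and backpropagating through that decoder, which is the extra cost the reply refers to.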