EDGE
Why use x_start as the target in each timestep of diffusion training?
I have seen noise, x_noisy, or v_prediction used as the training target, but here every timestep uses x_start as the training target, which seems a bit strange. Could you explain the reasoning or point me to relevant articles?
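For context, here is a minimal sketch of how I understand these targets to relate to each other, assuming the standard variance-preserving forward process x_t = sqrt(alpha_bar) * x_start + sqrt(1 - alpha_bar) * noise. The function names and the scalar `alpha_bar` are hypothetical and are not taken from the EDGE code; scalars stand in for tensors to keep it short.

```python
import math

# Assumption: standard DDPM-style forward process, where alpha_bar is the
# cumulative product of the noise schedule at the sampled timestep.
# All names below are illustrative, not from the EDGE repository.

def q_sample(x_start, noise, alpha_bar):
    """Noised sample x_t built from the clean sample and Gaussian noise."""
    return math.sqrt(alpha_bar) * x_start + math.sqrt(1.0 - alpha_bar) * noise

def v_target(x_start, noise, alpha_bar):
    """v-prediction target: a fixed linear mix of noise and x_start."""
    return math.sqrt(alpha_bar) * noise - math.sqrt(1.0 - alpha_bar) * x_start

def x_start_from_noise(x_t, noise, alpha_bar):
    """Recover x_start from x_t and a noise value (requires alpha_bar > 0)."""
    return (x_t - math.sqrt(1.0 - alpha_bar) * noise) / math.sqrt(alpha_bar)

def x_start_from_v(x_t, v, alpha_bar):
    """Recover x_start from x_t and a v value."""
    return math.sqrt(alpha_bar) * x_t - math.sqrt(1.0 - alpha_bar) * v

def noise_from_v(x_t, v, alpha_bar):
    """Recover noise from x_t and a v value."""
    return math.sqrt(1.0 - alpha_bar) * x_t + math.sqrt(alpha_bar) * v
```

If this is right, then given x_t and the timestep, a prediction of any one of noise, v, or x_start can be converted into the others, so I assume the choice mainly affects how the loss is weighted across timesteps. Is that the motivation here?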