LED
Questions about stage 1 training
Hi,
Thank you for sharing your code. As mentioned in your paper, training has two stages: the first trains the denoising diffusion model, and the second focuses on the leapfrog initializer. The repo seems to provide the code for stage 2 training, which loads a pretrained checkpoint of the denoising diffusion model directly. Could you also provide the code for stage 1 training? Do you use the leapfrog initializer in the first stage? If so, what initial values did you use for the estimated mean, variance, and sample prediction? Thanks!