improved-diffusion A problem about the weight λ of Lvlb

Open yinguanchun opened this issue 1 year ago • 3 comments

In the paper, λ is 0.001. The code sets learn_sigma to True and rescale_learned_sigmas to False, so the loss type will be gd.LossType.MSE. With this loss type, Lvlb is not multiplied by 0.001. Even if the loss type is gd.LossType.RESCALED_MSE, the code only does terms["vb"] *= self.num_timesteps / 1000.0. What is self.num_timesteps, and what is its effect? Thank you.

Sep 30 '23 09:09 yinguanchun

@yinguanchun I am also confused about this scaling factor. Have you figured it out?

Jan 14 '24 05:01 zen-d

I am also confused about this scaling factor. Have you figured it out?

May 23 '24 03:05 Feynman1999

In my opinion, the authors define L_{vlb} = L_0 + ... + L_T, i.e. the sum over all timesteps, not a single term L_t. Thus, they may compute the vlb loss with a scale factor of T (self.num_timesteps).
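
To make the arithmetic concrete, here is a minimal sketch of that reasoning. It is not the repository's code. It assumes one timestep is sampled uniformly per example and that there are T per-step terms indexed 0..T-1; the function names and the numbers in the usage note are made up for illustration. Under those assumptions, T * L_t is an unbiased estimate of the full sum, and multiplying a sampled term by T / 1000.0 is the same as applying λ = 0.001 to that estimate.

```python
def full_vlb(per_step_terms):
    """L_vlb taken as the sum of all per-timestep terms (hypothetical values)."""
    return sum(per_step_terms)


def expected_scaled_term(per_step_terms):
    """Exact expectation of T * L_t when t is drawn uniformly from 0..T-1.

    Averaging T * L_t over every possible t gives back the full sum, which is
    why a single sampled term scaled by T can stand in for L_vlb.
    """
    T = len(per_step_terms)
    return sum(T * l_t for l_t in per_step_terms) / T


def rescaled_vb(l_t, num_timesteps):
    """Same arithmetic as the line quoted in the question:
    terms["vb"] *= self.num_timesteps / 1000.0
    """
    return l_t * num_timesteps / 1000.0
```

For example, with the made-up terms [0.5, 0.25, 0.125, 0.125], expected_scaled_term returns the same value as full_vlb, and rescaled_vb(l_t, T) agrees with 0.001 * (T * l_t). When num_timesteps is 1000 the factor is exactly 1, so the sampled term is left unchanged.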

Aug 01 '24 00:08 yhy258
