JoonHyuk Seo
Results: 2 comments of JoonHyuk Seo
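In my opinion, the authors define L_{vlb} = L_0 + ... + L_T, not a single L_t. Since only one timestep t is sampled per training step, they may multiply the per-timestep VLB loss by the scale factor T (self.num_timestep) so that it estimates the full sum.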
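To make the scaling argument concrete, here is a minimal sketch in plain Python, not the repository's code. It assumes t is drawn uniformly, and the per-step values in `terms` are made up for illustration. The function names `full_vlb` and `sampled_vlb_estimate` are hypothetical.

```python
import random

def full_vlb(per_step_terms):
    """Exact sum of the per-timestep terms, i.e. the full L_vlb."""
    return sum(per_step_terms)

def sampled_vlb_estimate(per_step_terms, num_samples, rng):
    """Monte Carlo estimate: draw t uniformly and scale that term by the number of terms."""
    T = len(per_step_terms)
    total = 0.0
    for _ in range(num_samples):
        t = rng.randrange(T)             # uniform timestep, as assumed above
        total += T * per_step_terms[t]   # scaling by T makes the expectation equal the sum
    return total / num_samples

rng = random.Random(0)
terms = [1.0 / (t + 1) for t in range(10)]  # hypothetical per-step losses
exact = full_vlb(terms)
estimate = sampled_vlb_estimate(terms, 200_000, rng)
print(exact, estimate)
```

Without the factor T, the same average would converge to the mean of the terms, which is the sum divided by the number of terms.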
In my opinion, when you predict x_start at t \approx T with a cosine noise schedule, the \bar{\alpha}_t values (the cumulative product of the alphas) are very small compared to those of a linear noise schedule....
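The difference between the two schedules near t \approx T can be checked numerically. The sketch below is an illustration under assumptions, not the repository's implementation: it uses the commonly cited defaults (linear betas from 1e-4 to 0.02, cosine offset s = 0.008, betas clipped at 0.999) and T = 1000, none of which are stated in the comment. The function names are hypothetical.

```python
import math

def linear_alpha_bars(T, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for linearly spaced betas (assumed defaults)."""
    alpha_bars, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / max(T - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def cosine_alpha_bars(T, s=0.008, max_beta=0.999):
    """Cumulative product of (1 - beta_t) for a cosine schedule with clipped betas."""
    def f(u):
        return math.cos((u + s) / (1 + s) * math.pi / 2) ** 2

    alpha_bars, prod = [], 1.0
    for i in range(T):
        # beta_t is derived from the ratio of consecutive f values, then clipped
        beta = min(1.0 - f((i + 1) / T) / f(i / T), max_beta)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

T = 1000  # assumed number of timesteps
linear = linear_alpha_bars(T)
cosine = cosine_alpha_bars(T)
print("linear, last step:", linear[-1])
print("cosine, last step:", cosine[-1])
```

Under these assumed settings the final cosine value comes out several orders of magnitude below the final linear value, which is consistent with the observation above.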