DiffusionFromScratch
Why is the network's prediction normalized but not the ground-truth score?
Hi Animadversio!
Firstly, I'd like to commend you on your work; it's truly impressive! I've been closely examining the sample_X_and_score experiment in the ReverseSDE_Diffusion.ipynb notebook and have a few points I'd like to clarify.
In your approach to approximating the analytical score with a neural network, I noticed that no normalization seems to be applied to the ground-truth score when constructing the data/ground-truth pairs. However, in the training loop there appears to be a form of normalization on the marked line below. This has left me a bit puzzled. Specifically, while y_pred is normalized, y_train (which serves as the ground truth) isn't normalized by the timestep-dependent std. This gives the impression that they might be on different scales.
-------- code ----------
for ep in pbar:
    y_pred = score_model_analy(X_train, T_train)
    loss = torch.mean(torch.sum((y_pred - y_train)**2 * std_vec[:, None], dim=(1)))  # <======= here
-------- code ----------
Moreover, in the forward method, the output is normalized:
-------- code ----------
def forward(self, x, t):
    t_embed = self.embed(t)
    pred = self.net(torch.cat((x, t_embed), dim=1))
    pred = pred / self.marginal_prob_std_f(t)[:, None]  # <======= here
    return pred
-------- code ----------
Given this, I'm curious how y_pred and y_train can be compared arithmetically in the loss if they're potentially on different scales.
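To make my concern about the scales concrete, here is a tiny self-contained sketch; the std schedule, shapes, and names below are my own placeholders, not the ones in the notebook:

-------- code ----------
import torch

# Placeholder std schedule; NOT the notebook's marginal_prob_std_f
def toy_std(t):
    return 0.01 + 25.0 * t

t = torch.tensor([0.01, 0.5, 1.0])
x = torch.randn(3, 2)

# For standard-normal toy data perturbed as x_t = x_0 + toy_std(t) * z, the
# perturbed marginal is N(0, (1 + toy_std(t)^2) I), so the analytic score is
# -x / (1 + toy_std(t)^2); this plays the role of y_train here.
y_train = -x / (1.0 + toy_std(t)[:, None] ** 2)

# A network whose forward() divides its raw output by toy_std(t) produces
# values on the order of 1 / toy_std(t); a random tensor stands in for the
# (untrained) raw network output.
raw_out = torch.randn(3, 2)
y_pred = raw_out / toy_std(t)[:, None]

print("y_train scale per t:", y_train.abs().mean(dim=1))
print("y_pred  scale per t:", y_pred.abs().mean(dim=1))
-------- code ----------

With an untrained network, y_pred is pinned to the 1/std scale by the division in forward(), while the analytic y_train is not, which is exactly the mismatch I am asking about.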
I've also taken a look at the Colab notebook you provided (link) and attached an image for reference:
I'd greatly appreciate your insights on this matter. Thank you for your time and patience.
Warm regards,
In this example (https://colab.research.google.com/drive/1_MEFfBdOI06GAuANrs1b8L-BBLn3x-ZJ?usp=sharing#scrollTo=PtAwqb0QQUUn), in the "Defining the loss function" section:
-------- code ----------
z = torch.randn_like(x)                           # get normally distributed noise
perturbed_x = x + z * std[:, None, None, None]    # perturb x with noise of std at time t
score = model(perturbed_x, random_t)              # model output, divided by std in forward()
loss = torch.mean(torch.sum((score * std[:, None, None, None] + z)**2, dim=(1,2,3)))
-------- code ----------
It seems the 'score' is already normalized (the last line in the forward function), while 'z' is just a standard normal tensor drawn from N(0, 1). I think the loss should instead be either
loss = torch.mean(torch.sum((score + z)**2, dim=(1,2,3)))
or
loss = torch.mean(torch.sum((score * std[:, None, None, None] + z * std[:, None, None, None])**2, dim=(1,2,3)))
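For reference, here is a small sketch I used to probe the scales of the individual terms entering the three candidate losses; the std schedule and tensor shapes are placeholders of mine, not the notebook's:

-------- code ----------
import torch

def toy_std(t):
    # placeholder for marginal_prob_std; NOT the schedule used in the notebook
    return 0.01 + 25.0 * t

torch.manual_seed(0)
x = torch.randn(8, 1, 4, 4)                      # toy "images"
random_t = torch.rand(x.shape[0])
std = toy_std(random_t)
z = torch.randn_like(x)
perturbed_x = x + z * std[:, None, None, None]

# Stand-in for model(perturbed_x, random_t): an O(1) raw output divided by std,
# mimicking the normalization in the forward() quoted earlier.
raw = torch.randn_like(x)
score = raw / std[:, None, None, None]

# Magnitudes of the terms paired inside each candidate loss
print("score * std :", (score * std[:, None, None, None]).abs().mean().item())  # O(1)
print("score       :", score.abs().mean().item())                               # O(1/std)
print("z           :", z.abs().mean().item())                                   # O(1)
print("z * std     :", (z * std[:, None, None, None]).abs().mean().item())      # O(std)
-------- code ----------

Whichever form is intended, comparing these magnitudes is what made me wonder which pair of terms is meant to live on the same scale.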