DiffusionFromScratch

Why is the network's prediction normalized but the ground-truth score is not?


Hi @Animadversio,

Firstly, I'd like to commend you on your work; it's truly impressive! I've been closely examining the sample_X_and_score experiment in the ReverseSDE_Diffusion.ipynb notebook and have a few points I'd like to clarify.

In your approach to approximating the analytical score with a neural network, I noticed that no normalization is applied to the ground-truth score when constructing the data/ground-truth pairs. However, the code snippet below does apply a form of weighting in the loss (see the marked line). This has left me a bit puzzled: y_pred is normalized by the timestep's marginal std, while y_train (which serves as the ground truth) isn't, which gives the impression that the two might be on different scales.

```python
for ep in pbar:
    y_pred = score_model_analy(X_train, T_train)
    loss = torch.mean(torch.sum((y_pred - y_train)**2 * std_vec[:, None], dim=(1)))  # <======= here
```

Moreover, in the forward method, the output is normalized:

```python
def forward(self, x, t):
    t_embed = self.embed(t)
    pred = self.net(torch.cat((x, t_embed), dim=1))
    pred = pred / self.marginal_prob_std_f(t)[:, None]  # <======= here
    return pred
```

Given this, I'm curious how arithmetic operations between y_pred and y_train are meaningful if the two are potentially on different scales.
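To make the scale question concrete, here is a small self-contained sketch I put together (my own toy setup, not code from the notebook; the VE-style marginal std, the sigma value, and all variable names are assumptions):

```python
import torch

torch.manual_seed(0)

def marginal_std(t, sigma=5.0):
    # VE-SDE marginal std as in Song et al.'s tutorial:
    # sigma(t) = sqrt((sigma^(2t) - 1) / (2 ln sigma))
    return torch.sqrt((sigma ** (2 * t) - 1.0) / (2.0 * torch.log(torch.tensor(sigma))))

for t in (0.05, 0.5, 1.0):
    std = marginal_std(torch.tensor(t))
    z = torch.randn(10_000)
    true_score = -z / std        # analytical score of the perturbation kernel: scale ~ 1/std
    raw_output = -z              # an O(1) quantity a network can comfortably regress toward
    y_pred = raw_output / std    # forward() divides by std, giving the same 1/std scale
    print(f"t={t:.2f}  std={std:.3f}  |true_score|~{true_score.abs().mean():.3f}"
          f"  |y_pred|~{y_pred.abs().mean():.3f}")
```

In this toy case both quantities carry the same 1/std scale, which suggests y_pred and y_train do live on the same scale after the division in forward(), and the std_vec factor in the loss acts as a per-timestep weight rather than a unit conversion. I may be missing something, so please correct me if this reading is wrong.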

I've also taken a look at the Colab notebook you provided (link) and attached an image for reference. I'd greatly appreciate your insights on this matter. Thank you for your time and patience.

Warm regards,

Guanbin-Huang · Aug 11, 2023

In this example (https://colab.research.google.com/drive/1_MEFfBdOI06GAuANrs1b8L-BBLn3x-ZJ?usp=sharing#scrollTo=PtAwqb0QQUUn), in the "Defining the loss function" section:

```python
z = torch.randn_like(x)             # get normally distributed noise
perturbed_x = x + z * std[:, None, None, None]

score = model(perturbed_x, random_t)
loss = torch.mean(torch.sum((score * std[:, None, None, None] + z)**2, dim=(1,2,3)))
```

It seems that 'score' is normalized (by the last line of the forward function), while 'z' is simply a standard normal tensor (mean 0, std 1). I think the loss should instead be one of the following (a quick numerical check follows below):

```python
loss = torch.mean(torch.sum((score + z)**2, dim=(1,2,3)))
```

or

```python
loss = torch.mean(torch.sum((score * std[:, None, None, None] + z * std[:, None, None, None])**2, dim=(1,2,3)))
```
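For reference, here is the quick numerical check I mentioned (a toy 1-D setup of my own; x0, sigma, and the variable names are not from the notebook) of how the original weighting lines up with the conditional score:

```python
import torch

torch.manual_seed(0)

# Toy 1-D setup (my own, not from the notebook): x0 is clean data,
# sigma plays the role of std = marginal_prob_std(random_t) at one timestep.
x0 = torch.randn(10_000)
for sigma in (0.1, 1.0, 5.0):
    z = torch.randn_like(x0)
    perturbed_x = x0 + sigma * z
    # Conditional score of the perturbation kernel N(x0, sigma^2):
    cond_score = -(perturbed_x - x0) / sigma**2   # equals -z / sigma
    residual = cond_score * sigma + z             # what the tutorial's loss drives to zero
    print(f"sigma={sigma}: |score|~{cond_score.abs().mean():.3f}  "
          f"|score*sigma + z|~{residual.abs().mean():.1e}")
```

Since perturbed_x = x + z * std, the conditional score is -(perturbed_x - x) / std**2 = -z / std, and forward() already divides the raw network output by std. Under that reading, score * std recovers the raw output, the original loss drives that raw output toward -z, and the model output converges to the conditional score -z / std, so the terms are on the same O(1) scale. My first variant would then mix a 1/std-scale term with an O(1) term, and the second equals std**2 times the first, changing only the per-timestep weighting. But I may be off here, so I'd appreciate confirmation.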

YuboLong · Nov 23, 2023