Training procedure does not look like Algorithm 1 in the main paper

Open v18nguye opened this issue 2 years ago • 1 comments

Hi authors,

I noticed that you only train the loss $\hat{L}_t$ in all experiments, while Algorithm 1 proposes three types of loss. Could you explain why? Thanks.

Oct 29 '23 21:10 v18nguye

Hi @v18nguye , and thanks for taking an interest in our work.

I'm guessing you are referring to the following lines in the training script: https://github.com/maxjcohen/diffusion-bridges/blob/4f722f6a982504149a1ae4008ce62ee621300637/scripts/ho/cifar.py#L77-L81

In this experiment, we start with a pre-trained VQ-VAE model, so we consider the other two terms of the loss function, $L^{rec}$ and $L^{reg}$, to already be optimized. In the `training_step` function, we only minimize the remaining term, $L^{prior}$.

It is also possible to train the VQ-VAE and the prior model jointly, as we show in our paper, by computing all three loss terms and minimizing them together.
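
As a rough illustration of the two setups (a sketch, not the code from the repository), the choice of objective could be written as below. The names `LossTerms` and `training_objective` are hypothetical, and the unweighted sum in the joint case is an assumption made for simplicity; the paper's Algorithm 1 is the reference for the exact form.

```python
from dataclasses import dataclass


@dataclass
class LossTerms:
    """Per-batch values of the three loss terms (hypothetical container)."""

    rec: float    # L^{rec}: reconstruction term
    reg: float    # L^{reg}: regularization term
    prior: float  # L^{prior}: prior term


def training_objective(terms: LossTerms, vqvae_pretrained: bool) -> float:
    """Return the scalar to minimize for one batch.

    With a pre-trained VQ-VAE, L^{rec} and L^{reg} are treated as already
    optimized, so only the prior term is minimized. Otherwise all three
    terms are combined; the plain sum is an assumption of this sketch.
    """
    if vqvae_pretrained:
        return terms.prior
    return terms.rec + terms.reg + terms.prior


# Illustrative values only, not results from any experiment.
terms = LossTerms(rec=0.5, reg=0.25, prior=1.0)
prior_only = training_objective(terms, vqvae_pretrained=True)
joint = training_objective(terms, vqvae_pretrained=False)
```

In the pre-trained case the VQ-VAE parameters would also typically be kept out of the optimizer, so that minimizing $L^{prior}$ updates only the prior model.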

Nov 03 '23 10:11 maxjcohen