Reconstructed celeb_a_hq images have good edges but wrong color histogram. Why?
Hi @hojonathanho. First of all, thank you for your code. I'm training the model on 256x256 images of the celeb_a_hq dataset taken from Kaggle. The parameters I'm using are:
- ema_decay=1e-10
- optimizer='adam'
- dataset_dir='tensorflow_datasets'
- warmup=5000
- num_diffusion_timesteps=1000
- beta_start=0.0001
- beta_end=0.02
- grad_clip=1.
- beta_schedule='linear'
- randflip=1
- batch_size=3
- img_shape=[256,256,3]
- model_name='unet2d16b2c112244'
- lr=0.00002
- loss_type='noisepred'
- dropout=0.0
- block_size=1
I train the model for roughly 20,000 steps, and the output is produced by:
out = unet.model(
    x, t=t, y=y, name='model', ch=128, ch_mult=(1, 1, 2, 2, 4, 4),
    num_res_blocks=2, attn_resolutions=(16,), out_ch=out_ch,
    num_classes=self.num_classes, dropout=dropout)
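(For context, this is how I understand the training target for loss_type='noisepred'; a rough NumPy sketch with made-up variable names, using the linear beta schedule from the parameters above:)

import numpy as np

# Linear schedule: beta_start=0.0001, beta_end=0.02, T=1000 (values listed above)
T = 1000
betas = np.linspace(0.0001, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)  # a_bar_t

def q_sample(x_start, t, noise):
    # Forward process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * x_start + np.sqrt(1.0 - a_bar) * noise

# With loss_type='noisepred', the model is trained so that
# model(q_sample(x_0, t, eps), t) approximates eps, i.e.
# loss = mean((eps - model_output) ** 2)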
The problem is that the best result I have gotten so far is the following (top: original, bottom: final result):
The images always come out with a blue-ish tint. I believe this is caused by the loss function: it predicts the noise instead of x_start, which "seems to be weighted naturally like SNR" as you wrote, but by doing so the image colors shift, and the color histogram ends up resembling the Gaussian distribution of the noise:
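To make the weighting point concrete: since x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps, a noise prediction can always be converted back into an x_start prediction (predicted_x_start below is a made-up helper, reusing alphas_cumprod from the sketch above):

def predicted_x_start(x_t, t, eps_pred):
    # Invert the forward process for a given noise prediction:
    # x_0 = (x_t - sqrt(1 - a_bar_t) * eps_pred) / sqrt(a_bar_t)
    a_bar = alphas_cumprod[t]
    return (x_t - np.sqrt(1.0 - a_bar) * eps_pred) / np.sqrt(a_bar)

# Substituting eps = (x_t - sqrt(a_bar_t) * x_0) / sqrt(1 - a_bar_t) gives
#   ||eps - eps_pred||^2 = (a_bar_t / (1 - a_bar_t)) * ||x_0 - x0_pred||^2
# where a_bar_t / (1 - a_bar_t) is the SNR of x_t, so predicting the noise
# is an SNR-weighted version of predicting x_start.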
Why is this loss used even though it changes the color spectrum? Am I missing something, like the correct way to obtain an output?
Any updates on this? I am facing similar issues.
Hi!
I don't remember the specific issue I was facing, but I suggest you try the function p_sample_loop to generate novel images from noise. The self._denoise argument corresponds to a method of your model object (for example, a class like Model_Celeb(gpu_utils.Model)) that returns the output of the network. p_sample_loop implements the reverse process described in the paper, so as far as I understand it should start from pure NOISE, not from another image.
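For concreteness, something along these lines (a rough sketch; the module paths and exact signatures follow my reading of the repo's diffusion_tf package, so treat them as assumptions and adjust as needed):

# Rough sketch of unconditional sampling with the repo's utilities
from diffusion_tf import diffusion_utils
from diffusion_tf.models import unet

diffusion = diffusion_utils.GaussianDiffusion(
    betas=diffusion_utils.get_beta_schedule(
        'linear', beta_start=0.0001, beta_end=0.02,
        num_diffusion_timesteps=1000),
    loss_type='noisepred')

def denoise_fn(x, t):
    # Same U-Net call as in the training code above; unconditional, so y=None
    return unet.model(
        x, t=t, y=None, name='model', ch=128, ch_mult=(1, 1, 2, 2, 4, 4),
        num_res_blocks=2, attn_resolutions=(16,), out_ch=3,
        num_classes=1, dropout=0.0)

# Reverse process: starts from pure Gaussian noise, NOT from an input image;
# returns a tensor of generated samples
samples = diffusion.p_sample_loop(denoise_fn, shape=[4, 256, 256, 3])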
Let me know if this solves your issue.