
Lambda selection and training with limited resources

Open jgvinholi opened this issue 1 year ago • 0 comments

Hi, I'd like to ask which value you chose for lambda in the loss function, i.e. the weight that multiplies the consistency term of the loss.

Do you think it is best to train from scratch with the proposed loss function, or would you recommend pretraining with the DDPM loss first and only then switching to the proposed loss?
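To make the two options concrete, here is a minimal sketch of what I have in mind. All names and values (`consistency_weight`, `warmup_steps`, `ramp_steps`, `lam_max`) are my own placeholders and are not taken from your code or paper. Setting `warmup_steps = 0` and `ramp_steps = 0` corresponds to training from scratch with the full loss, while a positive `warmup_steps` corresponds to a DDPM-only phase first.

```python
def consistency_weight(step, warmup_steps, ramp_steps, lam_max):
    """Hypothetical schedule for lambda, the weight on the consistency term.

    Returns 0 during a DDPM-only warmup, then ramps linearly to lam_max.
    """
    if step < warmup_steps:
        return 0.0  # pure DDPM phase: the consistency term is switched off
    if ramp_steps <= 0:
        return lam_max  # no ramp: switch to the full weight immediately
    progress = min(1.0, (step - warmup_steps) / ramp_steps)
    return lam_max * progress


def total_loss(ddpm_loss, consistency_loss, lam):
    """Combined objective: DDPM term plus lambda times the consistency term."""
    return ddpm_loss + lam * consistency_loss
```

In a training loop I would call `consistency_weight(step, ...)` once per step and pass the result as `lam` to `total_loss`. I don't know whether a ramp like this is necessary or whether a hard switch works just as well.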

Also, since I don't have 8 V100 GPUs, I was wondering whether it is possible to train on a single A6000, which has 48 GB of memory, using a much smaller batch size and model size (roughly 100 million parameters). The task is different from yours, though: it requires less guessing from the model and is closer to image-to-image translation. Do you believe that the CDM loss function would bring better stability? I managed to train a DDPM that works fairly well for my problem, but the generated samples are not very consistent: even with the same condition input X and the same number of denoising steps, changing the seed of the random number generator can change the quality of the results a lot.
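For reference, this is roughly how I quantify the seed sensitivity. It is only a sketch: `sample_quality` stands in for "run the sampler with this seed on a fixed condition X and a fixed number of denoising steps, then score the output", and `toy_quality` is a made-up stand-in so the snippet runs on its own, not my real model or metric.

```python
import random
import statistics


def seed_spread(sample_quality, seeds):
    """Return (mean, population std) of a quality score across seeds.

    sample_quality: callable taking a seed and returning a float score
    for one generated sample, with the condition input held fixed.
    """
    scores = [sample_quality(seed) for seed in seeds]
    return statistics.mean(scores), statistics.pstdev(scores)


def toy_quality(seed):
    """Made-up stand-in for a real sampler plus metric (illustration only)."""
    rng = random.Random(seed)
    return 0.8 + 0.1 * rng.uniform(-1.0, 1.0)


mean_score, std_score = seed_spread(toy_quality, seeds=range(16))
print(f"mean={mean_score:.3f} std={std_score:.3f}")
```

A large standard deviation relative to the mean is what I mean by "not very consistent". I would hope the consistency term shrinks it, but that is exactly what I am unsure about.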

Thanks a lot for your help. Impressive work.

jgvinholi · Sep 28 '23 11:09