Gabriel Mongaras
Nice!! Glad you got it working. I know LFS is a bit weird sometimes.
I initially had a feature where you could change the voice model given sample audio, but since the custom voice model was way too large and way too slow, I...
Sorry, but at the moment I don't have too much time to work on getting the custom voice module to work. The main issue is it was very experimental and...
Sorry, but I'm not quite sure what you mean by this.
Aside from blinking and lip syncing, I didn't add any other movement to the image. It would be a cool feature to add though!
Oh yeah, that makes sense! As I've learned more about diffusion models, it looks like predicting x_0 produces better results, since one can then skip steps as in DDIM.
Thanks for letting me know! Are you using an older version of PyTorch? I think einsum used to be limited to lowercase characters as this GitHub issue shows: https://github.com/pytorch/pytorch/issues/21412 I...
Changing capital C to lowercase c will probably run into errors since the einsum will do the multiplication incorrectly. Try changing it from capital C to lowercase d: X =...
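To illustrate why reusing an existing letter is a problem, here is a small sketch using NumPy's einsum. The subscripts and arrays are made up for the demo and are not the ones from the project. Renaming the capital index to a letter that is already in use changes what the einsum computes, while renaming it to an unused letter leaves the result unchanged.

```python
import numpy as np

# Toy arrays, hypothetical and unrelated to the project's tensors.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Original form with a capital index: an ordinary matrix product.
with_capital = np.einsum("ij,jK->iK", A, B)

# Renaming K to an unused lowercase letter keeps the same computation.
renamed_unused = np.einsum("ij,jd->id", A, B)

# Renaming K to j, which is already in use, silently changes the meaning:
# it now multiplies A[i, j] by the diagonal entry B[j, j] with no sum.
renamed_clash = np.einsum("ij,jj->ij", A, B)

print(with_capital)
print(renamed_unused)
print(renamed_clash)
```

The clash case does not raise an error here, which is part of why it is easy to miss: the output has a valid shape but different values.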
I added a training log here: https://github.com/gmongaras/Diffusion_models_from_scratch/blob/main/results/res_res_partial_log.out What do you mean when you say you moved this project to a segmentation task? Are you using a pre-trained model and finetuning...
That's just for logging. Instead of outputting the latest loss value (-1), I output the mean of the latest 10 losses (-10:) to reduce noise in the output loss value.
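A minimal sketch of that logging idea, assuming the losses are kept in a plain Python list. The function name and the sample values are hypothetical and not taken from the repo.

```python
def smoothed_loss(losses, window=10):
    """Return the mean of the last `window` losses (fewer if not enough yet)."""
    if not losses:
        raise ValueError("losses must contain at least one value")
    recent = losses[-window:]  # the [-10:] slice when window is 10
    return sum(recent) / len(recent)


# Hypothetical loss history: the last value is a noisy spike.
history = [1.0] * 9 + [3.0]

latest = history[-1]               # what logging the [-1] value would show
smoothed = smoothed_loss(history)  # what logging the mean of [-10:] shows

print(f"latest={latest:.2f} smoothed={smoothed:.2f}")
```

Because slicing never reads past the start of the list, the same call works during the first few steps, when fewer than 10 losses have been recorded.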