Phil Wang

Results 814 comments of Phil Wang

that's interesting. I'm not sure; there is a subtle difference in the resnet blocks. I'm using the GLIDE style architecture here with norm, activation, then project. However, the original ddpm...
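For context, the ordering being described (norm, then activation, then projection) can be sketched roughly as follows. This is a minimal numpy illustration of the block ordering only, not the actual GLIDE or repo code; all function names and shapes here are my own assumptions.

```python
import numpy as np

def norm(x, eps=1e-5):
    # simplified normalization over the feature axis (stand-in for group norm)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def silu(x):
    # SiLU / swish activation
    return x * (1.0 / (1.0 + np.exp(-x)))

def glide_style_block(x, weight):
    # GLIDE-style ordering: norm -> activation -> projection
    h = norm(x)
    h = silu(h)
    h = h @ weight   # the "project" step (a linear / 1x1-conv-like projection)
    return x + h     # residual connection

x = np.random.randn(2, 8)
w = np.random.randn(8, 8) * 0.02
out = glide_style_block(x, w)
```

The subtle point in question is just where the projection sits relative to the norm and activation; the original ddpm blocks arrange these steps differently.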

@babysor yea, just jump in with a pull request

@Diaz1980 no there is not

@pfeatherstone i think i allow for bidirectional shifting, maybe that's why. i can check later

@aliabid2243 Hi Abid! Which version of pytorch are you on?

what is new in the most recent paper that is not in the repo?

@yiyixuxu :rocket: looks great YiYi! I added a link to it from the readme

@Hosein47 do you have experiments set up to measure oversmoothing? part of me wonders if it is even a problem worth solving, given chatgpt has shown scale and data matters way...

try https://github.com/lucidrains/x-transformers#gated-residual for starters, and if you see it alleviate oversmoothing, i can add a simpler technique
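The gated residual being suggested replaces the plain `x + branch(x)` update with a learned gate that interpolates between the residual stream and the branch output, in the spirit of highway networks. Below is a minimal numpy sketch of the idea only; the gate parameterization, names, and shapes are my own assumptions, not the x-transformers implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_residual(x, branch_out, w_gate, b_gate):
    # gate is computed from both the residual stream and the branch output
    gate = sigmoid(np.concatenate([x, branch_out], axis=-1) @ w_gate + b_gate)
    # interpolate instead of adding: gate=1 keeps the residual stream,
    # gate=0 takes the branch output (a plain residual would be x + branch_out)
    return gate * x + (1.0 - gate) * branch_out

dim = 8
x = np.random.randn(2, dim)
branch_out = np.random.randn(2, dim)   # e.g. an attention or feedforward output
w_gate = np.zeros((2 * dim, dim))
b_gate = np.full(dim, 3.0)             # bias the gate toward the residual at init
y = gated_residual(x, branch_out, w_gate, b_gate)
```

Initializing the gate bias positive keeps the block close to identity early in training, which is one reason gating is proposed as a remedy for oversmoothing: layers can learn to pass tokens through unchanged rather than repeatedly averaging them.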

@ghpkishore architecture-wise it is pretty much complete. there are no plans to train it, as it seems like everyone is preferring the denoising diffusion approach

well, will you look at that: https://arxiv.org/abs/2310.05737 — guess i'll put more work into this