Mehdi Cherti comments

Results: 51 comments by Mehdi Cherti

Not an issue - richer datasets

"does that mean there's no 500x iterations to get a good looking image?" Yes.

Not an issue - richer datasets

Following the tweet you mentioned above, here is an example with "deviantart, volcano": https://imgur.com/a/cYMsNo5, generated with a model currently being trained on Conceptual Captions 12M.

Not an issue - richer datasets

@johndpope I added a bunch of pre-trained models if you want to give them a try.

Slow Training Speed

Hi @s13kman, thanks for your interest! Since you have access to multiple GPUs, I would suggest using multi-GPU training to speed things up. Multi-GPU training is actually supported through Horovod...

Slow Training Speed

Hi @CrossLee1, sorry for not replying until now. It takes around 6 hours, but I train them on 64 A100 GPUs (data parallel with Horovod) to speed up the process....

Positional Stickiness

Yes, I see exactly what you mean, and I noticed this in all the models I trained (both VitGAN and mlp_mixer). This was the reason why I started by the way...

Positional Stickiness

It could also be related to the architectures themselves (VitGAN and mlp_mixer), not sure. Another way to make the constraint even more explicit is to add a diversity loss on...

Positional Stickiness

"For instance - the "photo of san francisco" captions tend to produce wildly different outputs" Ah okay, so what are the text prompts here where you observed different outputs,...

Positional Stickiness

Yes exactly!

Positional Stickiness

Another way to see this is through interpolation. Here is a video showing interpolation (of text encoded features) from "the sun is a vanishing point absorbing shadows" to "the moon is...
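
To make the idea of interpolating between text encoded features concrete, here is a minimal sketch. It assumes plain linear interpolation between two already-encoded feature vectors; the comment above does not say which interpolation scheme was actually used. The helper names (`lerp`, `interpolation_path`) and the toy vectors are hypothetical stand-ins, not the real encoder output.

```python
from typing import List, Sequence


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linearly interpolate between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolation_path(
    a: Sequence[float], b: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` vectors going from `a` (t=0) to `b` (t=1), inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [lerp(a, b, i / (steps - 1)) for i in range(steps)]


# Toy stand-ins for the encoded features of the two prompts (hypothetical values).
features_start = [1.0, 0.0, 0.0]
features_end = [0.0, 1.0, 0.0]

path = interpolation_path(features_start, features_end, steps=5)
for frame_index, features in enumerate(path):
    # In a real pipeline each vector would be passed to the generator to
    # render one video frame; here we only print it.
    print(frame_index, features)
```

If positions in the image stay fixed while only content changes along such a path, that would be consistent with the stickiness discussed in this thread.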