unet-stylegan2

Problems with parameters in face generation

Open rockdrigoma opened this issue 5 years ago • 15 comments

Just tried to train a model on a 3.5K-face dataset, but the results look strange. My command was this: unet_stylegan2 --batch_size 4 --network_capacity 32 --gradient_accumulate_every 8 --aug_prob 0.25 --attn-layers [1,2]

This is at epoch 14 (iteration 14000):

[sample image at epoch 14]

Any suggestions to improve this, or should I just wait a bit longer? I had better results at this point with regular stylegan2.

rockdrigoma avatar Aug 25 '20 20:08 rockdrigoma

@rockdrigoma ohh darn, what are the values for D: and G: at this point?

lucidrains avatar Aug 25 '20 23:08 lucidrains

i have to admit, this repository is a bit experimental, and i'm not sure if it will work for datasets with less than 10k samples

lucidrains avatar Aug 25 '20 23:08 lucidrains

@rockdrigoma ohh darn, what are the values for D: and G: at this point?

9% 13950/149000 [9:41:20<95:32:18, 2.55s/it]G: 0.84 | D: 0.71 | GP: 0.36 | PL: 0.50 | CR: 0.00 | Q: 0.00

rockdrigoma avatar Aug 26 '20 00:08 rockdrigoma

@rockdrigoma hmmm, that looks fine actually... maybe let it sit overnight and see where it is at 50k

lucidrains avatar Aug 26 '20 00:08 lucidrains

@rockdrigoma hmmm, that looks fine actually... maybe let it sit overnight and see where it is at 50k

What's the meaning of each metric?

rockdrigoma avatar Aug 26 '20 00:08 rockdrigoma

@rockdrigoma the way it works in GANs is that there are two AIs. One AI learns to generate images, and the other learns to determine whether an image is fake (generated) or real. They are pitted against each other, and at some point they both become really good. The numbers let you monitor that they stay evenly balanced; if one becomes too powerful, both stop learning.

lucidrains avatar Aug 26 '20 00:08 lucidrains

@rockdrigoma you want to see both numbers above 0, but also below 20

lucidrains avatar Aug 26 '20 00:08 lucidrains
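In case it helps future readers: below is a minimal, self-contained sketch of the two-network loop being described, using toy linear networks and random tensors as stand-ins for real images. It is not this repository's model, only an illustration of the G/D balance; the printed G and D values play the same role as the G: and D: columns in the logs above. (The other columns are, as far as I can tell, auxiliary terms: GP is the gradient penalty, PL the path-length regularization, CR the cutmix consistency regularization mentioned later in the thread, and Q a feature-quantization loss.)

```python
# Minimal GAN loop sketch (illustrative only; NOT unet-stylegan2's code).
# Toy linear networks and random tensors stand in for the real model and data.
import torch
import torch.nn as nn

latent_dim, image_dim, batch = 64, 128, 4   # arbitrary toy sizes

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, image_dim))
D = nn.Sequential(nn.Linear(image_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(batch, image_dim)  # stand-in for a batch of real images

    # Discriminator step: learn to score real images high and generated ones low.
    fake = G(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(D(real), torch.ones(batch, 1)) + bce(D(fake), torch.zeros(batch, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator step: learn to produce images the discriminator scores as real.
    fake = G(torch.randn(batch, latent_dim))
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()

    if step % 50 == 0:
        # Rule of thumb from this thread: both values should stay above 0 and below 20.
        print(f"G: {g_loss.item():.2f} | D: {d_loss.item():.2f}")
```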

@rockdrigoma how did it go?

lucidrains avatar Aug 26 '20 19:08 lucidrains

@rockdrigoma how did it go?

Epoch 32 1% 598/118000 [35:19<90:35:00, 2.78s/it]G: 2.40 | D: 4.00 | GP: 0.06 | PL: 1.17 | CR: 0.00

[image: mr.jpg]

[sample image: 32-mr]

rockdrigoma avatar Aug 26 '20 19:08 rockdrigoma

ohh darn :( could you try it without the augmentation probability? this framework actually already includes augmentation intrinsically (cutmix), and perhaps the differentiable random crops are making it too difficult for the generator

lucidrains avatar Aug 26 '20 20:08 lucidrains
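(For concreteness: relative to the command in the opening post, that suggestion would mean re-running something like unet_stylegan2 --batch_size 4 --network_capacity 32 --gradient_accumulate_every 8 --attn-layers [1,2], i.e. the same flags with --aug_prob dropped. This is only an illustration of the suggestion above, not a verified recipe.)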

the numbers do look good though! perhaps we just aren't being patient, but i suspect the extra augmentation is making it too difficult

lucidrains avatar Aug 26 '20 20:08 lucidrains

default<data/>: 8%|█▏ | 11650/150000 [8:20:38<90:07:32, 2.35s/it] G: -338.38 | D: 82127.60 | GP: 75818352640.00 | PL: 0.00 | CR: 19274924032.00
default<data/>: 8%|█▏ | 11700/150000 [8:22:37<89:36:36, 2.33s/it] G: 3379234816.00 | D: 35068168.00 | GP: 31833282560.00 | PL: 0.00 | CR: 13639467512365056.00

These values seem extremely large. Should I just be patient, or is something going wrong? I used the configuration you provided in this repository, except that I increased the resolution from 128x128 to 256x256 and lowered the batch size to 2 (to fit on my GPU). The latest output looks like this:

[sample image]
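(A general note, not something confirmed in this thread: when the per-step batch size has to shrink to fit in memory, the effective batch size can be kept roughly the same by raising gradient accumulation, e.g. --batch_size 2 --gradient_accumulate_every 16 instead of --batch_size 4 --gradient_accumulate_every 8, both of which give an effective batch of 32 samples per update.)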

Here's my experience: On a dataset of 3,200 gemstones, lightweight-gan and stylegan2 both produce recognizable, but not great results after 12 epochs.

For example, stylegan2 after 13 epochs:

[sample image: stylegan2 after 13 epochs]

Unet-stylegan2 with --aug-prob 0.25 --top-k-training produces stuff like this:

[sample image: unet-stylegan2 with --aug-prob 0.25 --top-k-training]

After removing --aug-prob 0.25 --top-k-training and training a few more epochs, I get results like this:

[sample image: unet-stylegan2 after removing those flags]

It seems like unet-stylegan2 indeed does not work well with a small amount of data even with augmentation, and removing augmentation makes it worse.

jtremback avatar Dec 08 '20 02:12 jtremback

default<data/>: 8%|█▏ | 11650/150000 [8:20:38<90:07:32, 2.35s/it] G: -338.38 | D: 82127.60 | GP: 75818352640.00 | PL: 0.00 | CR: 19274924032.00
default<data/>: 8%|█▏ | 11700/150000 [8:22:37<89:36:36, 2.33s/it] G: 3379234816.00 | D: 35068168.00 | GP: 31833282560.00 | PL: 0.00 | CR: 13639467512365056.00

These values seem extremely large. Should I just be patient, or is something going wrong? I used the configuration you provided in this repository, except that I increased the resolution from 128x128 to 256x256 and lowered the batch size to 2 (to fit on my GPU). The latest output looks like this:

[sample image]

I am facing much the same problem. After just a few iterations, the losses blow up to d: 19807393792.000; g: 5749469.500; r1: 103350984704.000. Any ideas? TAT

JinshuChen avatar Jul 05 '21 11:07 JinshuChen

Do you guys have any update on that issue?

sandrodevdariani avatar Jul 07 '23 06:07 sandrodevdariani