DualStyleGAN
About reproducing the Anime style
Hi, thanks for sharing the code.
I downloaded the Anime dataset according to the README.md, but I get bad results like this:
Can you provide more details about training on the Anime dataset?
Here are some of my training results; are they correct? I used your provided pretrained model generator-pretrain.pt, so I skipped Stages I & II and fine-tuned DualStyleGAN (Stage III) using the parameters from the paper:
The picture above is fintune-004800.png. I use one GPU, so I trained 8*600=4800 iterations with batch=4 (the arithmetic is sketched after the pictures).
The picture above is the destylization log image.
The picture above is dualstylegan-002000.jpg
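In case it helps, here is the arithmetic behind my iteration count. I assumed that iterations should scale inversely with the effective batch size so that the total number of images seen matches the paper's 8-GPU setting; this linear rule is my own assumption and may not hold for GAN training.

```python
# My iteration scaling (assumption: keep iters * effective_batch constant
# with the paper's 8-GPU setting; this linear rule may not hold for GANs).
paper_gpus, batch_per_gpu, paper_iters = 8, 4, 600
my_gpus = 1

paper_effective_batch = paper_gpus * batch_per_gpu  # 32
my_effective_batch = my_gpus * batch_per_gpu        # 4

my_iters = paper_iters * paper_effective_batch // my_effective_batch
print(my_iters)  # 4800
```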
Hoping to hear from you soon.
Thank you for your interest.
Dataset
> Hi, thanks for sharing the code. I downloaded the Anime dataset according to the README.md, but I get bad results like this:
This style image is not in the Anime dataset that I used to train my model. Did you add new images to the Anime dataset?
Batch Size and Iterations
> [image] The picture above is fintune-004800.png. I use one GPU, so I trained 8*600=4800 iterations with batch=4.
Obviously, the network is over-fine-tuned. You should use far fewer iterations, so that the original real faces can still be recognized. Another problem is that it is hard to train StyleGAN/DualStyleGAN with a small batch size. Here is a comparison between batch=4*1=4 and batch=4*8=32:
You see, even though I did not over-fine-tune the StyleGAN, I still got poor results with a single GPU.
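If you are stuck with a single GPU, gradient accumulation may partly mimic a larger effective batch. Below is a generic PyTorch sketch, not code from this repo: the real GAN losses, regularizers, and discriminator updates are omitted, and accumulation does not reproduce true large-batch behavior exactly.

```python
import torch

accum_steps = 8  # 4 images * 8 steps = effective batch of 32

model = torch.nn.Linear(512, 512)  # stand-in for the generator
opt = torch.optim.Adam(model.parameters(), lr=2e-3)

opt.zero_grad()
for step in range(1000):
    x = torch.randn(4, 512)          # per-step batch of 4
    loss = model(x).pow(2).mean()    # stand-in for the GAN loss
    (loss / accum_steps).backward()  # average gradients over accumulated steps
    if (step + 1) % accum_steps == 0:
        opt.step()                   # one update per 32 images
        opt.zero_grad()
```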
Destylization
> The picture above is the destylization log image.
Since your StyleGAN is over-fine-tuned, the destylization results have little content correspondence with the input anime faces. Here is my result:
Finetune DualStyleGAN
> [image] The picture above is dualstylegan-002000.jpg
I also found that fine-tuning on the Anime dataset is the most unstable among all the datasets, due to the large discrepancy between real faces and anime faces. On a single GPU, it becomes much more unstable. I used the following settings to obtain the performance above.
On a single GPU, the model often fell into mode collapse, so I trained many times and used the best run. Maybe you could use a lower learning rate and try different hyperparameters to find a relatively more stable configuration.
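As a starting point for such a sweep, one common heuristic is to scale the learning rate linearly with the effective batch size. The values below are illustrative only, not the repo's actual optimizer setup.

```python
import torch

# StyleGAN2 is commonly trained with lr=0.002 at an effective batch of 32;
# a linear scaling rule suggests ~0.00025 at batch 4. Treat this as a
# starting point for a hyperparameter sweep, not a recipe.
base_lr, base_batch, my_batch = 0.002, 32, 4
scaled_lr = base_lr * my_batch / base_batch  # 0.00025

generator = torch.nn.Linear(512, 512)    # stand-in modules
discriminator = torch.nn.Linear(512, 1)
g_optim = torch.optim.Adam(generator.parameters(), lr=scaled_lr, betas=(0.0, 0.99))
d_optim = torch.optim.Adam(discriminator.parameters(), lr=scaled_lr, betas=(0.0, 0.99))
```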
Thank you for the detailed and patient reply~ Yes, I added several custom pictures, so I use 189 pictures in total. Have you published the single-GPU analysis you mentioned above in the arXiv paper? Could you please share the detailed version with that information :). I'll try to train with 4 GPUs next, since you let me know the importance of multiple GPUs~ Thank you again for your reply~
This single-GPU analysis is in the supplementary material of our submission. We removed it from the arXiv version to make the paper more concise. Here is the original version:
I also trained DualStyleGAN on one V100 GPU with a style that has a large discrepancy from real faces, and it overfits on my style dataset. Did it get better after you trained on 4 GPUs?
The official StyleGAN is trained with batch=4 on each of 8 GPUs. Therefore, I think it is better to use this setting to achieve the best performance. Obviously 4 GPUs are better than 1 GPU, and you may need to try different hyperparameters to find a relatively more stable configuration on 4 GPUs.
Also, you can save many checkpoints and choose the one from just before overfitting.
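Concretely, that just means saving on a short interval during training, along these lines (a sketch; the interval, paths, and names here are illustrative, and the training scripts already save checkpoints, so you mainly need to shorten the interval):

```python
import os
import torch

save_every = 100  # short interval so you can roll back to just before collapse
os.makedirs("checkpoint/sweep", exist_ok=True)

def maybe_save(i, generator, g_optim):
    # Call this once per training iteration i.
    if i % save_every == 0:
        torch.save(
            {"g": generator.state_dict(), "g_optim": g_optim.state_dict()},
            f"checkpoint/sweep/{i:06d}.pt",
        )
```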
Thank you for your reply.
I think my StyleGAN2 is not over-fine-tuned, but I still got bad results when doing destylization. Is the style too difficult?
Fine-tuned StyleGAN2:
Some destylization results:
I think the style is too abstract and too different from real humans, so the destylization results are not ideal. I have two solutions:
- Judging from the visual quality, I think you can just use the results in the second column as the destylization results for training DualStyleGAN. Their quality is better than the final destylization results in the last column.
- Decrease the number of epochs for fine-tuning StyleGAN. Try different epochs and find the one where the fine-tuned StyleGAN gives the best destylization results, e.g. with the checkpoint sweep sketched below.
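One way to pick such an epoch automatically is to score each checkpoint's destylization outputs against the inputs with a perceptual metric such as LPIPS (lower means closer content). A sketch, assuming you save one checkpoint per candidate epoch; the `lpips` package is an extra dependency, and `load_pairs` is a placeholder you would implement with the actual destylization step:

```python
import glob
import lpips
import torch

metric = lpips.LPIPS(net="alex")

def load_pairs(ckpt_path):
    # Placeholder: run destylization with this checkpoint and return
    # (inputs, results) as image tensors in [-1, 1]. Random tensors here
    # only keep the sketch self-contained.
    return torch.rand(4, 3, 256, 256) * 2 - 1, torch.rand(4, 3, 256, 256) * 2 - 1

def score(ckpt_path):
    inputs, destylized = load_pairs(ckpt_path)
    with torch.no_grad():
        return metric(inputs, destylized).mean().item()  # lower = closer

best = min(sorted(glob.glob("checkpoint/sweep/*.pt")), key=score)
print("best checkpoint:", best)
```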
Thanks. In fact I have a paired dataset, so I will try to get the instyle/exstyle codes from the paired data. If that still doesn't work, maybe the style is not suitable for DualStyleGAN because of its abstraction.
@JuncFang-git Hi, did you get a nice result by getting the style codes from paired data?
No, I couldn't get a nice result.
I tried to train DualStyleGAN with paired instyle/exstyle codes (both obtained from the pSp encoder). Some checkpoint records are shown below:
0-iter
5500-iter
Then I tested the model with some user photos. Some results are shown below:
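For reference, this is roughly how I build the paired codes. It is only a sketch: `psp_encoder` stands in for the real pSp encoder, whose preprocessing and call signature differ.

```python
import torch

def psp_encoder(image: torch.Tensor) -> torch.Tensor:
    # Placeholder: a real pSp encoder maps an aligned 256x256 face to an
    # 18x512 W+ latent code.
    return torch.randn(image.shape[0], 18, 512)

real_face = torch.rand(1, 3, 256, 256)   # photo from the paired dataset
anime_face = torch.rand(1, 3, 256, 256)  # its paired stylized version

instyle = psp_encoder(real_face)   # intrinsic code from the photo
exstyle = psp_encoder(anime_face)  # extrinsic code from the anime image
# These (instyle, exstyle) pairs are then used to train DualStyleGAN.
```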
@JuncFang-git Your results are interesting. It is quite strange that the model at 5500 iterations produces both cartoon faces and real faces. I haven't seen this phenomenon before.
Haha, it's interesting. The left part of DualStyleGAN is a complete pSp structure after I use the real image's instyle code. Maybe it's more difficult to change the style by interpolating the exstyle code (see the sketch below). But I don't know why it produces both cartoon faces and real faces.
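By interpolating the exstyle code I mean something like the blend below (illustrative only; DualStyleGAN actually injects the extrinsic code through its own modulation blocks rather than a raw average):

```python
import torch

instyle = torch.randn(1, 18, 512)  # intrinsic code from the real photo
exstyle = torch.randn(1, 18, 512)  # extrinsic code from the style image

t = 0.7  # 0 = keep the real-face style, 1 = full extrinsic style
blended = (1 - t) * instyle + t * exstyle
```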
@JuncFang-git Hi, I found your style very "kawaii" (cute), it's funny! Did you create the paired data with JoJoGAN? Why not train it with an end-to-end network?