PyTorch-StudioGAN

Training ReACGAN with fewer classes and higher resolution images

Open festinais opened this issue 3 years ago • 7 comments

I initially trained ReACGAN with 7 classes and 64x64 resolution images and got fair generated samples, reaching an FID of 40 at 50% of training.

Right now I'm training with only 3 classes and with 128x128 resolution images. I changed the config file for this particular number of classes and the z_embedding.
I was expecting to get better results with fewer classes and higher resolution. However, the FID is not converging yet. At 40% of training the FID is 152, and the generated images still look unrealistic.
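
For reference, the FID values quoted in this thread are Fréchet distances between two Gaussians fitted to Inception features of the real and the generated images, so lower is better. The symbols below are chosen here for illustration: \mu_r, \Sigma_r are the feature mean and covariance of the real images, and \mu_g, \Sigma_g those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```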

Any input/idea on why this might happen is appreciated! Thank you

festinais avatar Apr 04 '22 15:04 festinais

I was expecting to get better results with fewer classes and higher resolution. However, the FID is not converging yet. At 40% of training the FID is 152, and the generated images still look unrealistic.

=> I think this phenomenon can be attributed to two sources: (1) an insufficient training dataset and (2) improper hyperparameter selection. To solve the problem, I recommend turning on differentiable augmentations for data-efficient training. After that, tuning the hyperparameters of your model will solve the problem.
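
To make the idea of differentiable augmentation concrete, here is a minimal PyTorch sketch, not StudioGAN's own implementation. It assumes image batches shaped (N, C, H, W), and the function names (`rand_brightness`, `rand_translation`, `diff_augment`) are made up for illustration. The key point is that the same random transforms are applied to both real and generated batches before the discriminator, using only operations that let gradients flow back to the generator.

```python
import torch
import torch.nn.functional as F


def rand_brightness(x: torch.Tensor) -> torch.Tensor:
    # One random offset in [-0.5, 0.5) per image; addition keeps gradients intact.
    offset = torch.rand(x.size(0), 1, 1, 1, dtype=x.dtype, device=x.device) - 0.5
    return x + offset


def rand_translation(x: torch.Tensor, ratio: float = 0.125) -> torch.Tensor:
    # Shift each image by a random integer offset; exposed borders become zeros.
    n, _, h, w = x.shape
    max_dy, max_dx = int(h * ratio + 0.5), int(w * ratio + 0.5)
    dy = torch.randint(-max_dy, max_dy + 1, (n,), device=x.device)
    dx = torch.randint(-max_dx, max_dx + 1, (n,), device=x.device)
    # Zero-pad width by max_dx and height by max_dy on each side.
    padded = F.pad(x, (max_dx, max_dx, max_dy, max_dy))
    rows = torch.arange(h, device=x.device).view(1, h, 1) + (max_dy - dy).view(n, 1, 1)
    cols = torch.arange(w, device=x.device).view(1, 1, w) + (max_dx - dx).view(n, 1, 1)
    batch = torch.arange(n, device=x.device).view(n, 1, 1)
    # Indexing acts as a gather, so gradients reach the source pixels.
    return padded.permute(0, 2, 3, 1)[batch, rows, cols].permute(0, 3, 1, 2)


def diff_augment(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical policy: brightness jitter followed by translation.
    return rand_translation(rand_brightness(x))


# Stand-in for a generator output; in training this would be G(z, y).
fake = torch.rand(4, 1, 128, 128, requires_grad=True)
augmented = diff_augment(fake)
augmented.sum().backward()  # gradients flow through the augmentation
print(augmented.shape, fake.grad is not None)
```

In a training loop, the discriminator would see `diff_augment(real)` and `diff_augment(fake)` in its own update, and `diff_augment(fake)` again in the generator update. Which transforms are appropriate for X-ray images is a separate choice that this sketch does not settle.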

Best,

Minguk

mingukkang avatar Apr 06 '22 09:04 mingukkang

Thank you for your feedback!

I tried training with all 7 classes at 128x128 resolution, and the FID is not converging even after 30% of training: it is always above 200. However, when I trained with all 7 classes at 64x64 resolution, the FID was converging, and after 40% of training it reached 45.

In the end it is the same dataset with the same classes, just at a different resolution. Do you have any idea why this might happen? Does it mean the model is not able to generate highly discriminable images?

Thank you!

festinais avatar Apr 08 '22 08:04 festinais

Just to give you a more detailed overview of the dataset, it is the MURA dataset: https://stanfordmlgroup.github.io/competitions/mura/

I'm trying to generate synthetic data for the underrepresented classes: humerus and forearm. There is also the problem of class overlap and that's why I'm trying to use ReACGAN to generate hard negatives with good fidelity.

festinais avatar Apr 08 '22 09:04 festinais

This is how the generated images look when I trained with all of the classes at 64x64 image size. Training was only 50% complete, and the best FID score was 45.60. I didn't use any augmentations or hyperparameter tuning. Screenshot 2022-04-08 at 11 34 18

Screenshot 2022-04-08 at 11 32 11

Note: one of the classes, Forearm, collapsed (this is visible in the generated samples above).


Now I'm training again with all of the seven classes but with resolution 128x128.

This is how the FID is converging (not converging!) Screenshot 2022-04-08 at 11 37 03

And the generated samples: Screenshot 2022-04-08 at 11 38 12

So this is the summary of my two experiments. I want to know the reason behind this. I have some intuition, but I'm not sure. So before I move on to trying different augmentations that might be helpful, I wanted to first settle this and be sure what's happening. I would highly appreciate your feedback! Thank you!

festinais avatar Apr 08 '22 09:04 festinais

Also, if you have any feedback on what type of augmentations to use, or other ideas that would be helpful for this particular dataset, I would highly appreciate it! Thank you.

festinais avatar Apr 08 '22 09:04 festinais

@festinais hi bro, any progress?

DongChen06 avatar Aug 10 '22 14:08 DongChen06