CycleGAN
Synthetic to Real Image
Hello. I tried the PyTorch Cycle-Consistent GAN code for the task of translating synthetic foot images to real foot images.
I gathered my own training data and performed several experiments. Here are some results, with realA (synthetic) images on the left and fakeB (generated real) images on the right of each pair.
I intend to use the fakeB images to train a neural network for foot pose estimation, but the generated data quality does not seem good enough for that goal. Qualitatively, 3/4 of the test fakeB images have issues such as deformity or unrealistic coloring.
Following the methodology in Mueller et al., I added a geometric consistency loss term to ensure that realA and fakeB images produce the same segmentation mask, and therefore the same pose.
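For reference, the core of such a loss term is just a distance between the two segmentation masks. Below is a minimal sketch with masks flattened to plain lists for illustration (the function name and interface are my own, not from Mueller et al. or the repo; in a real training loop you would apply `torch.nn.L1Loss` to mask tensors and add the result, weighted by `lambda_geometric`, to the generator loss):

```python
def geometric_consistency_loss(mask_real, mask_fake):
    """Mean L1 distance between the segmentation mask of realA and the
    mask predicted on fakeB. Zero when the silhouettes match exactly,
    which is the pose-preservation constraint being enforced."""
    assert len(mask_real) == len(mask_fake)
    return sum(abs(a - b) for a, b in zip(mask_real, mask_fake)) / len(mask_real)
```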
In my most recent experiment, I used 10K synthetic images and 10K real images. I used the following flags:
python train.py --no_flip --resize_or_crop none --lambda_geometric 0.2 --niter 20 --niter_decay 20 --lr_decay_iters 8 --lambda_identity 0 --batch_size 1 --norm instance --model cycle_gan
All other flags were left at their defaults. Training on 10K images with 1 GPU for the specified number of iterations took 4 days.
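For context, `--niter 20 --niter_decay 20` corresponds to the repo's linear learning-rate schedule: the rate is held constant for the first `niter` epochs, then decays roughly linearly toward zero over the next `niter_decay` epochs. A simplified sketch of that multiplier (modeled on the repo's `lambda_rule`; exact epoch offsets may differ between versions):

```python
def lr_multiplier(epoch, niter=20, niter_decay=20):
    """Factor applied to the initial learning rate at a given epoch:
    1.0 for the first `niter` epochs, then linear decay over the
    following `niter_decay` epochs."""
    return 1.0 - max(0, epoch - niter) / float(niter_decay + 1)
```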
I hope you can give me advice on how to achieve better results:
- Do more training images yield better results? To generate the real foot images, I video-recorded my foot from different angles in front of a green screen, then extracted an image every 15 frames.
- Since I have so many training images, do you recommend using a higher batch size? I read that you believe a batch size of 1 produces the best results.
- Any other settings/flags I should modify?
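The frame-extraction step described above can be sketched as follows (a hypothetical helper; in practice `cap` would be a `cv2.VideoCapture` opened on the green-screen recording):

```python
def extract_every_nth(cap, step=15):
    """Read frames sequentially from a cv2.VideoCapture-like object and
    keep every `step`-th one, returning (frame_index, frame) pairs."""
    idx = 0
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:          # end of video
            break
        if idx % step == 0:
            frames.append((idx, frame))
        idx += 1
    return frames
```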
Cool results. A few comments:
- It seems that the geometric consistency loss used in Mueller et al. might be a good solution to the deformity issue. Does it help your case?
- You may want to use data augmentation, e.g., flipping, resizing, and cropping (remove `--no_flip` and use `--loadSize 286 --fineSize 256`).
- You can also try running CycleGAN on small patches, e.g., `--loadSize 256 --fineSize 128` or `--loadSize 256 --fineSize 96`.
- A bigger batch size should speed things up, and the results should be comparable. I think batch size 1 works well for pix2pix with batch norm. For CycleGAN, we use batch size 1 mostly due to memory issues.
@junyanz I will try data augmentation. I am using one NVIDIA K80 GPU. I increased batch_size to 6. When batch_size = 1, each example in a batch takes ~1.2 to 1.3 seconds. Now, at batch_size = 6, each example in a batch takes ~0.8 to 0.9 seconds.
- Any way performance can be improved beyond this?
- For batch_size > 1, should instance normalization or batch normalization be used?
Looks good. You might want to use instance norm.
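The reason instance norm is the safer choice with batch_size > 1 is that it normalizes each sample independently, so statistics never mix across images in a batch. A toy 1-D illustration of that per-sample normalization (real models would use `torch.nn.InstanceNorm2d` on feature maps):

```python
from statistics import mean, pstdev

def instance_norm_1d(batch, eps=1e-5):
    """Normalize each sample with its own mean/variance (instance norm),
    unlike batch norm, which pools statistics over the whole batch."""
    out = []
    for sample in batch:  # sample: flat list of activations
        m = mean(sample)
        var = pstdev(sample) ** 2
        out.append([(v - m) / (var + eps) ** 0.5 for v in sample])
    return out
```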
Hi, I am working on a very similar problem, generating realistic images from synthetic ones. Could you let me know whether you obtained better results by following the above tricks? It would be really helpful.