About the poor StarGAN results in your paper
On page 6 of your paper, the StarGAN results are really poor. I reproduced StarGAN on CelebA-HQ and also got bad results. But the StarGAN paper shows good 256x256 generations, so I'm a little confused. (When I trained StarGAN on CelebA, the results were good enough, but on CelebA-HQ it fails even to reconstruct the input image.)
You noticed an interesting fact. Thank you for asking. In brief, there are three settings: (1) StarGAN trained on CelebA, (2) StarGAN trained on CelebA and RaFD, and (3) StarGAN trained on CelebA-HQ.
The StarGAN paper reported its good results with setting (2). RaFD contains numerous high-quality faces (256x256), but it is a private dataset that we cannot access. What we can do is try to reproduce the 256x256 result with only CelebA, which is setting (1). Setting (1) gives a reasonably good result. However, CelebA and CelebA-HQ differ not only in quality but also in quantity: CelebA has about 200,000 faces, whereas CelebA-HQ has only 30,000 high-quality faces.
We infer that StarGAN suffers from the scarcity of high-quality faces in setting (3). With CelebA-HQ and RaFD together, StarGAN might be able to produce good results. Our method, on the other hand, uses a matching-aware discriminator for the conditional labels, where StarGAN uses a classifier. This helps the GAN overcome the scarcity problem, so our method can still get good results with only the 30,000 CelebA-HQ faces.
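To make the difference concrete, here is a rough sketch of the two discriminator objectives. It is not our training code. It assumes the common matching-aware formulation in which the discriminator scores an (image, label) pair and is also shown real images paired with wrong labels as negatives. The function names, the 0.5 weighting, and the cross-entropy adversarial term in the classifier-style loss are illustrative assumptions (StarGAN's actual adversarial loss is different).

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one probability; prob is clamped away from 0 and 1."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def matching_aware_d_loss(d_real_match, d_real_mismatch, d_fake_match):
    """Discriminator loss when D scores (image, label) pairs jointly.

    d_real_match:    D(real image, its true label)  -> should be 1
    d_real_mismatch: D(real image, a wrong label)   -> should be 0
    d_fake_match:    D(generated image, its label)  -> should be 0
    The 0.5 weighting of the two negative terms is an assumption.
    """
    real_term = bce(d_real_match, 1.0)
    mismatch_term = bce(d_real_mismatch, 0.0)
    fake_term = bce(d_fake_match, 0.0)
    return real_term + 0.5 * (mismatch_term + fake_term)


def classifier_style_d_loss(d_real, d_fake, cls_prob_true_label):
    """Simplified classifier-style loss: real/fake term plus a separate label term.

    cls_prob_true_label is the auxiliary classifier's probability for the
    true label of a real image.
    """
    adversarial = bce(d_real, 1.0) + bce(d_fake, 0.0)
    classification = bce(cls_prob_true_label, 1.0)
    return adversarial + classification


# A discriminator that accepts a real image with a wrong label is penalized.
good = matching_aware_d_loss(0.9, 0.1, 0.1)
fooled_by_wrong_label = matching_aware_d_loss(0.9, 0.9, 0.1)
print(good < fooled_by_wrong_label)
```

In the matching-aware version, every real image also serves as a negative example when paired with a wrong label, so the label signal comes through the real/fake decision itself rather than through a separate classification head.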